I. INTRODUCTION


Viruses are multimolecular assemblies that range from small, regu- 
lar, and simple to large, pleiomorphic, and complex. They consist of 
virus-specified proteins and nucleic acids and, in the case of enveloped 
viruses, of host-derived lipids. In infected cells the assembly of these 
different components into virions occurs with high precision amidst a 
huge background of tens of thousands of host compounds. Two key 
factors determine the efficiency of the assembly process: intracellular 
transport and molecular interactions. 

Directional transport ensures the swift and accurate delivery of 
the virion components to the cellular compartment(s) where they must 
meet and form (sub)structures. Some viruses achieve this goal rel- 
atively simply when genome production occurs in close proximity 
to the virion assembly site (e.g., picornaviruses). Many viruses, however,
have evolved more elaborate strategies. This is illustrated, for
instance, by the α-herpesviruses. Assembly of these viruses starts in
the nucleus by the encapsidation of viral DNA, using cytoplasmically 
synthesized capsid proteins; nucleocapsids then migrate to the cyto- 
sol, by budding at the inner nuclear membrane followed by deenvelop- 
ment, to pick up the tegument proteins. Subsequently, the tegumented 
capsids obtain their final envelope by budding into vesicles of the trans- 
Golgi network (TGN), where the viral envelope proteins have congre- 
gated after their synthesis in the endoplasmic reticulum; the assembled 
viral particles are finally released by fusion of the virion-containing 
vesicles with the plasma membrane. To achieve their transport goals 
viruses provide their components with address labels that can be read 
by the transport machinery of the cell. Once brought together, forma- 
tion of the viral (sub)structures is governed and driven by their 
interactions. Whereas the assembly of nonenveloped viruses is gener- 
ally restricted to the cell cytoplasm, although often in association 
with membranes, that of enveloped viruses involves multiple cellular 
compartments, as exemplified already for herpesviruses. 

This review deals with the assembly of coronaviruses. We first 
describe what is known about the structure of the coronavirion and 
about the relevant properties of the structural components. We summa- 
rize the limited ultrastructural information about coronavirus assem- 
bly and budding. The main body of the review describes the interactions 
between the different structural components of the viruses and dis- 
cusses their relevance for the process of virion formation. This review 
has a limited scope; for further information about other aspects of coro- 
navirus biology the reader is referred to other reviews (de Vries et al., 
1997; Enjuanes et al., 2001; Gallagher and Buchmeier, 2001; Holmes, 
2001; Holmes et al., 2001; Lai, 1997; Lai and Cavanagh, 1997; Lai
et al., 1994; Masters, 1999; Perlman, 1998; Rossen et al., 1995; Sawicki 
and Sawicki, 1998; Siddell, 1995; Ziebuhr et al., 2000). 


II. STRUCTURE OF THE CORONAVIRION AND ITS COMPONENTS


Coronaviruses are a group of enveloped, plus-stranded RNA viruses 
presently classified as a genus, which, together with the genus 
Torovirus, constitutes the family Coronaviridae. These viruses are 
grouped with two other families, the Arteriviridae and the Roniviridae, 
into the order Nidovirales. This classification is not based on structural 
similarities (in fact, the structure and composition of the viruses from
the different families differ significantly) but on common features of
genome organization and gene expression (de Vries et al., 1997; Lai and 
Cavanagh, 1997). 

Coronaviruses infect a wide variety of mammals as well as avian 
species (Table I). In general they cause respiratory or intestinal 
infections, but some coronaviruses can also infect other organs 
(liver, kidney, and brain). Until recently, these viruses were mainly 
of veterinary importance. This situation has changed quite dramatically
because of the emergence of severe acute respiratory syndrome-associated
coronavirus (SARS-CoV) in late 2002, which emphasized
the potential relevance of coronaviruses for humans. On the basis of
antigenic and genetic relationships the coronaviruses have been
subdivided into three groups (Table I); the taxonomic position of
SARS-CoV has not been formally assigned.


TABLE I
CORONAVIRUS GROUPS, THEIR MAIN REPRESENTATIVES, HOSTS, AND
PRINCIPAL ASSOCIATED DISEASES

Group  Virus                               Host     Disease
1      Feline coronavirus (FCoV)           Cat      Respiratory infection/enteritis/
                                                    peritonitis/systemic enteritis
       Canine coronavirus (CCoV)           Dog      Enteritis
       Transmissible gastroenteritis       Pig      Enteritis
         virus (TGEV)
       Porcine epidemic diarrhea           Pig      Enteritis
         virus (PEDV)
       Porcine respiratory                 Pig      Respiratory infection
         coronavirus (PRCoV)
       Human coronavirus (HCoV)-NL63       Human    Respiratory infection
       Human coronavirus (HCoV)-229E       Human    Respiratory infection
2      Murine hepatitis virus (MHV)        Mouse    Respiratory infection/enteritis/
                                                    hepatitis/encephalitis
       Rat coronavirus (RCoV)              Rat      Respiratory infection
       Bovine coronavirus (BCoV)           Cow      Respiratory infection/enteritis
       Hemagglutinating encephalomyelitis  Pig      Enteritis
         virus (HEV)
       Human coronavirus (HCoV)-OC43       Human    Respiratory infection
3      Infectious bronchitis virus (IBV)   Chicken  Respiratory infection/enteritis
       Turkey coronavirus (TCoV)           Turkey   Enteritis
?      Severe acute respiratory            Human    Respiratory infection/enteritis
         syndrome-associated coronavirus
         (SARS-CoV)


A. Coronavirion 


Coronavirus particles have a typical appearance under the electron 
microscope. By the characteristic, approximately 20-nm-long spikes 
that emanate from their envelope the viruses acquire the solar image 
to which they owe their name (Fig. 1). The 80- to 120-nm virions have a 
pleiomorphic appearance that, whether artifact or real, reflects a 
pliable constellation, a feature that has severely hampered the ultra- 
structural analysis of these viruses. Hence, our knowledge about the 
structure of coronaviruses is still rudimentary. 

Fig. 1. Electron micrographs of mouse hepatitis virus strain A59 (MHV-A59) virions
without (A) and with (B) the hemagglutinin-esterase (HE) envelope protein (viruses
kindly provided by R. de Groot, Virology Division, Utrecht University, The Netherlands;
image courtesy of J. Lepault, VMS-CNRS, Gif-sur-Yvette, France). Large, club-shaped
protrusions consisting of spike (S) protein trimers give the viruses their corona solis-like
appearance. Viruses containing the HE protein display a second, shorter fringe of
surface projections in addition to the spikes. (C) Schematic representation of the
coronavirion. The viral RNA is encapsidated by the nucleocapsid (N) protein forming a helical
ribonucleoprotein (RNP), which is in turn part of a structure with spherical, probably
icosahedral, configuration. The nucleocapsid is surrounded by a lipid bilayer in which the
S protein, the membrane glycoprotein (M), and the envelope protein (E) are anchored. In
addition, some group 2 coronaviruses contain the HE protein in their lipid envelope as
illustrated on the right side of the particle.

The schematic representation of the current model of the coronavirion
drawn in Fig. 1 is based on morphological and biochemical
observations. As this picture illustrates, the particle consists of a
nucleocapsid or core structure that is surrounded by a lipid envelope. 
Anchored in this envelope are the three canonical coronavirus mem- 
brane proteins: the membrane (M) protein, the envelope (E) protein, 
and the spike (S) protein. Viruses from group 2 have an additional, 
fourth membrane protein, the hemagglutinin-esterase (HE) protein. 
As a consequence these viruses display a second, shorter (5 nm) fringe 
of surface projections in addition to the spikes (Fig. 1B) (Bridger et al., 
1978; King et al., 1985; Sugiyama and Amano, 1981). 

The ribonucleoprotein (RNP) core contains one copy of the viral 
genomic RNA. This RNA is packaged into a helical structure by multi- 
ple copies of nucleocapsid protein (N). Size estimations of the flexible 
cylindrical structures varied quite considerably, ranging between 
7 and 16 nm in diameter and up to 0.32 μm in length (see Laude and
Masters, 1995). The ribonucleoprotein helix appears in turn to be 
contained within a spherical, probably icosahedral, configuration as 
indicated by various ultrastructural approaches using purified trans- 
missible gastroenteritis virus (TGEV) and mouse hepatitis virus 
(MHV) (Risco et al., 1996, 1998). 

The molar ratio of the major structural proteins, S:N:M, has been 
variously estimated to be approximately 1:8:16 (Sturman et al., 1980), 
1:6:15 (Cavanagh, 1983a), 1:8:8 (Hogue and Brian, 1986), and 1:11:10 
(Liu and Inglis, 1991), although an M:N molar ratio of 3 has also been 
reported (Escors et al., 2001a). The S:HE molar ratio was estimated to 
be 4 (Hogue and Brian, 1986). The E protein is only a minor virion 
component and was calculated to occur in infectious bronchitis virus 
(IBV), TGEV, and MHV virions at a rate of approximately 100, 20, and 
10 molecules per particle, respectively (Godet et al., 1992; Liu and 
Inglis, 1991; Vennema et al., 1996).
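
The published S:N:M estimates are easier to compare with the separately reported M:N ratio once each is reduced to the M:N value it implies. The short Python sketch below does only that arithmetic for the four estimates quoted above; the dictionary layout and the function name are illustrative choices and are not taken from the cited studies.

```python
from fractions import Fraction

# S:N:M molar ratios as quoted in the text (S normalized to 1).
ESTIMATES = {
    "Sturman et al., 1980": (1, 8, 16),
    "Cavanagh, 1983a": (1, 6, 15),
    "Hogue and Brian, 1986": (1, 8, 8),
    "Liu and Inglis, 1991": (1, 11, 10),
}


def m_to_n_ratio(s_n_m):
    """Return the M:N molar ratio implied by an (S, N, M) triple."""
    _s, n, m = s_n_m
    return Fraction(m, n)


# Print the implied M:N value for each published estimate.
for source, triple in ESTIMATES.items():
    print(f"{source}: M:N = {float(m_to_n_ratio(triple)):.2f}")
```

The implied M:N values are 2.0, 2.5, 1.0, and about 0.9, all of which lie below the M:N ratio of 3 reported by Escors et al. (2001a).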

The lipid composition of coronaviral envelopes has been studied only 
to a limited extent. Comparison of the phospholipid composition of 
MHV with that of its host cell showed increased levels of sphingomye- 
lin, phosphatidylserine, and phosphatidylinositol and a decrease in 
the level of phosphatidylethanolamine (van Genderen et al., 1995). 
Whether the lipid composition of MHV is an accurate reflection of 
its budding compartment or whether certain lipids become enriched 
in the virus during budding is not known. 

What follows is a general description of the individual virion compo- 
nents and their properties. This description is by no means complete as 
it is restricted to the information that is of relevance to the main topic 
of this review. For a schematic representation of the coronavirus life 
cycle see Fig. 2. 
Fig. 2. The coronavirus life cycle. The replication cycle starts with attachment of the
virion by its S protein, that is, through the S1 subunit thereof, to the receptors on the
host cell. This interaction leads to fusion of the virus envelope with a cellular membrane,
for which the S2 subunit is responsible. From the genomic RNA that is released by
disassembly of the incoming particle the pol1a and pol1b genes are translated, resulting
in the production of two large precursors (Pol1a and Pol1ab), the many cleavage products
of which collectively constitute the functional replication-transcription complex. Genes
located downstream of the pol1b gene are expressed from a 3'-coterminal nested set of
subgenomic (sg) mRNAs, each of which additionally contains a short 5' leader sequence
derived from the 5' end of the genome (shown in red). Transcription regulatory sequences
(TRSs) located upstream of each gene serve as signals for the transcription of the
sgRNAs. The leader sequence is joined at a TRS to all genomic sequence distal to that
TRS by discontinuous transcription, most likely during the synthesis of negative-strand
sgRNAs. In most cases, only the 5'-most gene of each sgRNA is translated. Multiple
copies of the N protein package the genomic RNA into a helical structure in the
cytoplasm. The structural proteins S, M, and E are inserted into the membrane of the rough
endoplasmic reticulum (RER), from where they are transported to the ER-to-Golgi
intermediate compartment (ERGIC) to meet the nucleocapsid and assemble into
particles by budding. The M protein plays a central role in this process through interactions
with all viral assembly partners. It gives rise to the formation of the basic matrix of
the viral envelope generated by homotypic, lateral interactions between M molecules,
and it interacts with the envelope proteins E, S, and HE (if present), as well as with
the nucleocapsid, thereby directing the assembly of the virion. Virions are transported
through the constitutive secretory pathway out of the cell. On their way the
glycoproteins are modified in their sugar moieties, whereas the S proteins of some but not
all coronaviruses are cleaved into two subunits by furin-like enzymes (see text for
references).


B. Viral Genome


Coronaviruses contain a single-stranded positive-sense RNA
genome of some 27 to 31 kilobases, the largest nonsegmented viral
RNA genomes known. The RNA has a 5'-terminal cap and a 3'-terminal
poly(A) tract. Both genomic termini contain untranslated
regions (UTRs) of some 200-500 nucleotides that harbor several
cis-acting sequences and structural elements functioning in viral
replication and transcription. Coronaviruses have a typical genome
organization characterized by the occurrence of a distinctive set of genes
that are essential for viability and occur in a fixed order: 5'-polymerase
(pol)-S-E-M-N-3' (Fig. 3). The pol gene comprises approximately
two-thirds of the genome, from which it is translated directly. It encodes
two large precursors (Pol1a and Pol1ab), the many functional cleavage
products of which are collectively responsible for RNA replication
and transcription (for reviews on coronavirus transcription and
replication see de Vries et al., 1997; Lai, 1997; Lai and Cavanagh, 1997; Lai
et al., 1994; Sawicki and Sawicki, 1998; Ziebuhr et al., 2000). The more
downstream pol1b gene is translated by translational readthrough,
using a ribosomal frameshift mechanism for which a “slippery”
sequence and a pseudoknot structure are required.

Fig. 3. Coronavirus genome organization as illustrated for the group 2 virus MHV.
The single-stranded, positive-sense RNA genome contains 5'- and 3'-terminal
untranslated regions (UTRs) with a 5'-terminal cap and a 3'-terminal poly(A) tract. The leader
sequence (L) in the 5' UTR is indicated. All coronaviruses have their essential genes in
the order 5'-pol-S-E-M-N-3'. The pol1a and pol1b genes comprise approximately
two-thirds of the genome. The more downstream pol1b gene is translated by translational
readthrough, using a ribosomal frameshift mechanism. Transcription regulatory
sequences (TRSs) located upstream of each gene, which serve as signals for the
transcription of the subgenomic (sg) RNAs, are indicated by circles. The genes encoding the
structural proteins HE, S, E, M, and N are specified. Gray boxes indicate the accessory,
group-specific genes, in the case of group 2 coronaviruses genes 2a, HE, 4, 5a, and I.

The genes located downstream of pol1b are expressed from a
3'-coterminal nested set of subgenomic (sg) RNAs, each of which
additionally contains a short 5' leader sequence derived from the 5'
end of the genome. Transcription regulatory sequences (TRSs) located 
upstream of each gene serve as signals for transcription of the sgRNAs. 
The leader sequence is joined at a TRS to all genomic sequence distal 
to that TRS by discontinuous transcription, most likely during the 
synthesis of negative-strand sgRNAs (Sawicki and Sawicki, 1998). 
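
The nested-set arrangement described above can be sketched in a few lines of Python. This is a toy model only: genes are represented by labels rather than sequences, the accessory (group-specific) genes are omitted, and the names GENE_ORDER, LEADER, and nested_sg_rnas are hypothetical and introduced purely for illustration.

```python
# Toy model of the 3'-coterminal nested set of subgenomic (sg) RNAs.
GENE_ORDER = ["pol", "S", "E", "M", "N"]  # essential genes, 5' to 3'
LEADER = "L"  # label standing in for the 5' leader sequence


def nested_sg_rnas(gene_order, leader=LEADER):
    """Map each gene downstream of pol to the segments of its sgRNA.

    Each sgRNA is modeled as the leader followed by all genes from the
    TRS of the named gene to the 3' end of the genome.
    """
    downstream = gene_order[1:]  # pol is translated from the genome itself
    return {
        gene: [leader] + downstream[i:]
        for i, gene in enumerate(downstream)
    }


# Print each sgRNA from longest to shortest.
for first_gene, segments in nested_sg_rnas(GENE_ORDER).items():
    print(f"sgRNA for {first_gene}: 5'-" + "-".join(segments) + "-3'")
```

Every modeled sgRNA starts with the leader and ends with the N gene, which is what makes the set both leader-containing and 3'-coterminal.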

Besides the characteristic genes encoding the replicative and structural
functions, coronaviruses have a more variable collection of additional
genes that are located in two clusters in the 3'-terminal one-third
of the genome. The genes differ distinctly in their nature and
genomic position among the coronavirus groups, but they are specific 
for each group. These so-called group-specific genes appear not to be 
essential as shown by the occurrence of natural mutants defective in 
some of them (Brown and Brierley, 1995; Herrewegh et al., 1995; 
Kennedy et al., 2001; Luytjes, 1995; Shen et al., 2003; Vennema, 
1999; Vennema et al., 1998; Woods, 2001) and by the observed viability
of engineered deletion mutants lacking some or all of these genes 
(de Haan et al., 2002b; Fischer et al., 1997; Haijema et al., 2004; Ortego 
et al., 2003; Sola et al., 2001). Except for the group 2-specific HE 
protein and, possibly, the poorly characterized I protein (Fischer 
et al., 1997; Senanayake et al., 1992), the latter encoded by an open 
reading frame completely contained within the N gene, the group-specific
proteins do not appear to occur in virions. Although their
functions have not yet been resolved, mutant studies indicate that 
they play important roles in the interaction of coronaviruses with their 
host (de Haan et al., 2002b; Fischer et al., 1997; Haijema et al., 2004; 
Ortego et al., 2003). 


C. N Protein 


The N protein is the most abundantly expressed viral protein in 
infected cells (for a review, see Laude and Masters, 1995). Its size 
varies considerably between viruses from different groups (377-455 
amino acids, i.e., molecular masses ranging between 45 and 60 kDa), N 
proteins from group 2 coronaviruses (Table I) being the largest. Where- 
as the amino acid sequences of N proteins are quite similar within the 
groups, the homology between proteins from different coronavirus 
groups is rather limited (30-35%). An exception is a region spanning 
about 50 residues within the amino-terminal one-third of the N mole- 
cule, where high sequence identity has been conserved across the 
different groups. 

Despite the overall sequence variation the N proteins have a number 
of common characteristics. Consistent with their role as nucleic acid- 
binding proteins they are all highly basic because of the abundance of 
arginine and lysine residues. These are clustered mainly in two nearby 
regions in the middle of the molecules. The abundance of basic residues 
is reflected in the calculated overall isoelectric points of the N proteins, 
the values of which are in the range of 9.7-10.1. These numbers are all the
more significant in view of the acidic nature of the very carboxy-terminal
domain; pI values ranging from 4.3 to 5.5 were calculated
for the terminal 45 residues (Parker and Masters, 1990). Another
general characteristic of the N proteins is their high content (7-11%)
of serine residues, which are potential targets for phosphorylation. 
Although these residues occur all over the N molecule, their relative 
abundance within the first of the two basic regions is notable. 

Little is known about the three-dimensional structure of the N 
protein. Of the SARS-CoV N protein the amino-terminal domain 
(residues 45-181) was analyzed by nuclear magnetic resonance spectroscopy.
It appeared to consist of a five-stranded β sheet with a folding
distinct from that of other RNA-binding proteins (Huang et al., 2004). 

In coronavirus-infected cells the N protein can often be detected as 
one major and several minor forms, the latter polypeptides having a 
slightly lower molecular weight. The major species appeared to comi- 
grate in gels with the N protein observed in virions, indicating that 
only the full-length N species is incorporated into particles. How the
minor N species arise and whether they are of particular significance 
for infection is unclear. They are most likely derived by proteolytic 
processing from the major N species. This is supported by studies from 
Eleouet et al. (2000), who showed the TGEV N protein to be cleaved by 
caspases. Caspase cleavage sites were also predicted in the carboxy 
terminus of several other coronavirus N proteins (Eleouet et al., 2000; 
Ying et al., 2004). These features are in agreement with observations 
showing that antibodies directed against the carboxy terminus of 
the MHV and TGEV N proteins were not reactive with the faster 
migrating electrophoretic forms. Furthermore, these smaller N protein 
forms appeared to be derived from the major species as judged from 
pulse-chase analyses (for a review see Laude and Masters, 1995). 

The N protein is the only coronavirus structural protein known to 
become phosphorylated (for references see Laude and Masters, 1995). 
Both the major and minor N species appear to be phosphorylated as 
shown for MHV-A59 in Sac(—) cells (Rottier et al., 1981b) and for TGEV 
in LLC-PK1 cells (Garwes et al., 1984). Of the many potential target
serines only a few are actually modified in the case of MHV (Stohlman 
and Lai, 1979; Wilbur et al., 1986). N protein phosphorylation does not 
seem to play a critical role in the regulation of virus assembly. In 
contrast, it has been hypothesized that dephosphorylation of the pro- 
tein might facilitate disassembly during MHV cell entry (Kalicharran 
et al., 1996; Mohandas and Dales, 1991). 

Immunofluorescence microscopy has shown the N protein to be 
localized in a particulate manner throughout the cytoplasm of corona- 
virus-infected cells. Although the protein lacks a membrane-spanning 
domain it was found in association with membranes (Anderson and 
Wong, 1993; Sims et al., 2000; Stohlman et al., 1983). For MHV, the
N protein was found to colocalize partly with the membrane-associated 
viral replication complexes (Denison et al., 1999; van der Meer et al.,
1999). In addition to its cytoplasmic localization, the N proteins of 
IBV, MHV, and TGEV have also been demonstrated to localize to 
the nucleolus both in coronavirus-infected cells and when expressed 
independently (Hiscox et al., 2001; Wurm et al., 2001). Putative nucle- 
ar localization signals were identified in these proteins. The IBV N 
protein was found to interact with nucleolar antigens, which appeared 
to occur more efficiently when the N protein was phosphorylated, 
and to affect the cell cycle (Chen et al., 2002). However, because
MHV is able to replicate in enucleated cells (Brayton et al., 1981;
Wilhelmsen et al., 1981) the nucleolar localization of the N protein
does not appear to be an essential step during infection.
Although the primary function of the N protein is the formation
of the viral ribonucleoprotein complex, several studies indicate the 
protein to be multifunctional. As indicated by its intracellular locali- 
zation, the N protein is a likely component of the coronavirus replica- 
tion and transcription complex. Its presence is not an absolute 
requirement for replication and transcription because a human
coronavirus (HCoV) RNA vector containing the complete pol1ab gene
appeared to be functional in the absence of the N protein (Thiel et al.,
2003). However, the efficiency of the system was much enhanced when 
the protein was present. Furthermore, using an in vitro system, it was 
demonstrated that antibodies to the N protein, but not those against 
the S and M proteins, inhibited viral RNA synthesis by 90% (Compton 
et al., 1987). Interactions that have been observed between the 
N protein and leader/TRS sequences (Baric et al., 1988; Nelson et al.,
2000; Stohlman et al., 1988) and between N protein and the 3' UTR
(Zhou et al., 1996) suggest a role for the N protein in the discontinuous 
transcription process. Furthermore, the N protein was also shown to 
interact with cellular proteins that play a role in coronavirus RNA 
replication and transcription (Choi et al., 2002; Shi et al., 2000). 
In addition, the N protein was reported to function as a translational 
enhancer of MHV sgRNAs (Tahara et al., 1998). 


D. M Protein 


The M protein (previously known as E1 protein) is the most abun- 
dant envelope protein. It is the “building block” of the coronavirion and 
has been shown to interact with virtually every other virion compo- 
nent, as detailed in Section IV. The M protein is 221-230 residues in 
length, except for the group 1 M proteins, of which the amino terminus 
is about 30 residues longer. Despite large differences in primary 
sequences between M proteins from different antigenic groups, their 
hydropathicity profiles are remarkably similar. The M protein is highly 
hydrophobic. It has three hydrophobic domains alternating with short 
hydrophilic regions in the amino-terminal half of the protein, with the 
exception of the aforementioned group 1 M proteins, which have at
their amino terminus a fourth hydrophobic domain that functions as 
a cleavable signal peptide. The carboxy-terminal half of the protein is 
amphipathic, with a short hydrophilic domain at the carboxy-terminal 
end (Fig. 4). In the center of the protein, directly adjacent to the third 
hydrophobic domain, is a stretch of eight amino acids that is well 
conserved (SWWSFNPE). The conservation of the overall chemical 
features suggests that there are rigid structural constraints on the
M protein as a result of functional requirements (for a review on the M
protein, see Rottier, 1995).

Fig. 4. Membrane topology of the coronavirus envelope proteins. The HE and S
proteins are both type I membrane proteins, with short carboxy-terminal cytoplasmic tails.
The HE protein forms disulfide-linked homodimers, whereas the S protein forms
noncovalently linked homotrimers. The S1 subunits presumably constitute the globular
head, whereas the S2 subunits form the stalk-like region of the spike. The M protein
spans the lipid bilayer three times, leaving a small amino-terminal domain in the lumen
of intracellular organelles (or on the outside of the virion), whereas the carboxy-terminal
half of the protein is located on the cytoplasmic side of the membrane (or inside the
virion). In TGEV virions some of the M proteins have their cytoplasmic tail exposed on
the outside (not shown). The M protein is glycosylated at its amino terminus (indicated
by a diamond). The amphipathic domain of the M protein is represented by an oval. The
hydrophilic carboxy terminus of the E protein is exposed on the cytoplasmic side of
cellular membranes or on the inside of the virion. The E protein may span the bilayer
once (b) or twice (a).

Biochemical and theoretical studies led to a topological model for the 
MHV M protein (Armstrong et al., 1984; Rottier et al., 1984, 1986), in 
which the polypeptide spans the lipid bilayer three times, leaving 
a small amino-terminal domain (15-35 residues) in the lumen of 
intracellular organelles (or outside the virus), whereas the carboxy- 
terminal half of the protein is located on the cytoplasmic side of 
the membrane (or inside the virion). The lumenal domain and the 
hydrophilic carboxy terminus are susceptible to protease digestion 
and are thus exposed. The bulk of the carboxy-terminal half of the 
M protein is protease resistant, indicating that the amphipathic part of 
the protein is either folded tightly or embedded in the polar surface 
of the membrane. Indeed, a mutant lacking all three transmembrane 
domains was found to be associated with membranes (Mayer et al., 
1988). The model for the disposition of the M protein in the membrane 
was confirmed for IBV (Cavanagh et al., 1986). Interestingly, the 
M protein of TGEV was shown to adopt an additional conformation. In
virions about one-third of the M molecules have their carboxy terminus
exposed on the virus surface rather than buried inside the particle
(Escors et al., 2001a; Risco et al., 1995). This appears to have
immunological consequences (Risco et al., 1995) but the real significance of the
dual topology is unclear. In MHV the M protein was found to assume 
only one defined membrane topology (Raamsman et al., 2000). 

The coronavirus M protein is almost invariably glycosylated in 
its exposed amino-terminal domain. This provides the virion with 
a diffuse, hydrophilic cover on its outer surface. Whereas the group 
1 and 3 coronaviruses and SARS-CoV all contain M proteins with 
only N-linked sugars, the M proteins of group 2 coronaviruses are 
O-glycosylated (for a review see Rottier, 1995). An exception is MHV- 
2, the M protein of which carries both O- and N-linked sugars (Yamada 
et al., 2000). N-Glycosylation is initiated in the endoplasmic reticulum 
by the cotranslational linkage of a large oligosaccharide structure 
to the polypeptide at asparagine residues within the consensus 
sequence NXS/T (where X is any amino acid). In contrast, mucin-type 
O-glycosylation starts posttranslationally with the addition of an
N-acetylgalactosamine (GalNAc) monosaccharide to a hydroxyl amino
acid (serine or threonine). O-Glycosylation is subsequently completed by stepwise addition
of other monosaccharides such as galactose, N-acetylglucosamine, 
fucose, and sialic acid. MHV M proteins carry a well-conserved 
SS(X)TTXXP sequence at their extreme amino terminus. Despite the 
apparent presence of multiple hydroxyl amino acids as potential
oligosaccharide acceptor sites, the M protein of MHV-A59 was found to be
modified by the addition of only a single oligosaccharide side chain (de 
Haan et al., 1998b). This side chain, when studied in OST7-1 cells, 
appeared to be attached to the threonine at position 5. Mutation 
studies, however, revealed that alternative acceptor sites can also be 
used. No unique sequence motifs for O-glycosylation of MHV M could 
be identified, which is probably related to the occurrence in cells of 
multiple GalNAc transferases (de Haan et al., 1998b). As the expres- 
sion of these enzymes varies in cells, conservation of the SS(X)TTXXP 
motif in MHV M protein may serve to increase opportunities for the 
protein to become glycosylated in different cell types. 
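
The two sequence patterns mentioned in this paragraph, the N-glycosylation consensus NXS/T and the conserved MHV M motif SS(X)TTXXP, lend themselves to a simple pattern search. The Python sketch below is an illustration under stated assumptions: X is taken as any residue exactly as written in the text, "(X)" is interpreted as an optional residue, the 15-residue window used for "extreme amino terminus" is an arbitrary choice, and the example peptides are made up rather than real viral sequences. A match only marks a potential acceptor site; as noted above, not every candidate residue is actually modified.

```python
import re

# N-glycosylation consensus as stated in the text: N-X-S/T, X = any residue.
# A lookahead is used so that overlapping sequons are all reported.
N_SEQUON = re.compile(r"(?=N[A-Z][ST])")

# Conserved MHV M amino-terminal motif SS(X)TTXXP; "(X)" is read here as
# an optional residue, which is an interpretation of the notation.
MHV_M_MOTIF = re.compile(r"SS[A-Z]?TT[A-Z]{2}P")


def n_sequon_positions(seq):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in N_SEQUON.finditer(seq.upper())]


def has_mhv_m_motif(seq, window=15):
    """Check whether the first `window` residues contain SS(X)TTXXP."""
    return MHV_M_MOTIF.search(seq.upper()[:window]) is not None


# Made-up peptides for illustration only (not real viral sequences).
print(n_sequon_positions("MANGSNATKL"))
print(has_mhv_m_motif("MSSGTTKLPAAA"))
print(has_mhv_m_motif("MAAGTTKLPAAA"))
```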

The distinct conservation of N- and O-glycosylation among the 
M proteins of the different groups of coronaviruses suggests that the 
presence and the particular type of carbohydrates are somehow bene- 
ficial to the virus, most likely in its interaction with the host. Glycosyl- 
ation of the M protein appeared not to be required for envelope 
assembly (de Haan et al., 1998a) or for interaction with the S protein
(de Haan et al., 1999), nor did it influence virus replication in vitro
(de Haan et al., 2002a, 2003; Laude et al., 1992). Coronaviruses are able
to induce interferon α (IFN-α) by their glycoproteins (Baudoux et al.,
1998a,b). For TGEV (Charley and Laude, 1988; Laude et al., 1992) 
and MHV (de Haan et al., 2003), the oligosaccharides linked to the M 
protein were demonstrated to be important for efficient IFN induction 
in vitro. The glycosylation status of the MHV M protein was found to 
influence the ability of the virus to replicate in the liver but not in 
the brain (de Haan et al., 2003). Thus, viruses with N-glycosylated
M proteins replicated to a significantly higher extent in liver than 
otherwise identical viruses carrying O-glycosylated M proteins. MHV 
with unglycosylated M proteins replicated to the lowest extent. The 
mechanism behind these observations remains to be elucidated.

When expressed in cells independently from the other viral proteins,
the M proteins of MHV, IBV, TGEV, and feline coronavirus (FCoV)
accumulate in the Golgi compartment (Klumperman et al., 1994; Locker
et al., 1992; Machamer and Rose, 1987; Machamer et al., 1990; Rottier
and Rose, 1987), that is, beyond the site of virus budding, which is
the intermediate compartment between the ER and the Golgi (ERGIC).
The fine localization of the different proteins is, however,
not the same. For instance, whereas the MHV M protein is concen- 
trated in the trans-most Golgi compartments, the IBV M protein 
localizes to the cis side of the Golgi complex. Signals for localization 
appear to reside in the hydrophilic part of the cytoplasmic tail and in 
the transmembrane domains. The extreme carboxy-terminal tail of 
MHV M was shown to be necessary, although not sufficient, for Golgi 
localization (Armstrong and Patel, 1991; Locker et al., 1994). Mutant 
proteins lacking this domain were transported to the plasma mem- 
brane. Also, mutation of a single tyrosine in this domain, which occurs 
in the context of a potential internalization signal, resulted in plasma 
membrane localization (C. A. M. de Haan and P. J. M. Rottier, unpub- 
lished results). The first transmembrane domain of the IBV M protein 
was shown to be required and sufficient for localization to the cis-Golgi 
region (Machamer and Rose, 1987; Machamer et al., 1990, 1993; Swift 
and Machamer, 1991). This is not the case for the MHV M protein, of 
which mutants with only the first transmembrane domain did not 
leave the ER (Armstrong et al., 1990; Locker et al., 1994; Rottier 
et al., 1990). Moreover, insertion of the first transmembrane domain 
of MHV M into a reporter protein resulted in a chimeric protein 
that was transported to the cell surface (Armstrong and Patel, 1991; 
Machamer et al., 1993), unlike a similar chimeric protein containing 
the first transmembrane domain of IBV that was retained in the Golgi 
compartment (Machamer et al., 1993). Other MHV M mutant proteins 
lacking the first and second transmembrane domains were also 
not efficiently retained in the Golgi compartment and were diverted 
to endosomal structures (Armstrong et al., 1990; Locker et al., 1994). 
The mechanism by which Golgi retention of M proteins is regulated 
has not yet been resolved. However, oligomerization of the proteins, 
mediated by the transmembrane domains, seems to play an important 
role, perhaps in combination with retrieval mechanisms (de Haan 
et al., 2000; Maceyka and Machamer, 1997). Formation of oligomeric 
complexes has been demonstrated to correlate with Golgi retention of a 
reporter protein containing the first transmembrane domain of IBV M 
(Weisz et al., 1993) while also the Golgi-resident MHV M protein was 
found to occur in large, homomeric complexes (Locker et al., 1995). The 
lumenal domain of the M protein does not appear to contribute to 
localization; its deletion from MHV M did not affect the intracellular 
destination of the protein (Mayer et al., 1988; Rottier et al., 1990). 

In infected cells the M proteins of IBV and MHV were observed to 
occur in the membranes of the budding compartment as well as in 
the Golgi compartment. Under these conditions their cis-trans distri- 
bution in the Golgi compartment was the same as when these proteins 
were expressed independently (Klumperman et al., 1994; Machamer 
et al., 1990). 


E. E Protein 


The E protein (previously known as sM protein) is a small protein 
(76-109 residues) and a minor component of the coronaviral envelope. 
Although the primary structures of E proteins are quite conserved 
within the different coronavirus groups, they share little homology 
between the groups. However, the proteins have several structural 
features in common. The E protein contains a relatively large hydro- 
phobic region in its amino-terminal half, followed by a cysteine-rich 
region, an absolutely conserved proline residue, and a hydrophilic tail. 
E is an integral membrane protein, which is assembled in membranes 
without the involvement of a cleaved signal peptide (Raamsman et al., 
2000). Its membrane topology has not been firmly established. 
Although the opposite was proposed initially for the TGEV E protein 
(Godet et al., 1992), there seems to be consensus about the hydrophilic 
carboxy terminus being exposed on the cytoplasmic side in cells or 
on the inside of the virion (Corse and Machamer, 2000; Raamsman 
et al., 2000). The amino terminus of the MHV E protein was not 
detectably present on the virion outside (Raamsman et al., 2000) but 
appeared to be exposed cytoplasmically when it was extended with an 
amino-terminal epitope tag (Maeda et al., 2001), consistent with a 
topological model in which the hydrophobic domain spans the bilayer 
twice. For the IBV E protein evidence was provided indicating that 
the amino terminus is exposed lumenally in cells, consistent with a 
single spanning topology (Corse and Machamer, 2000) (Fig. 4). 

The E protein is not glycosylated but appears to become palmitoy- 
lated. This was shown most convincingly for the IBV E protein by 
labeling with [³H]palmitate (Corse and Machamer, 2002), both in 
IBV-infected cells and when the protein was expressed. Mutagenesis 
revealed that one or both of the two conserved cysteines became 
modified. The result is consistent with the observed increase in elec- 
trophoretic mobility of the MHV E protein after treatment with 
hydroxylamine, an agent that cleaves thioester-linked acyl chains 
(Yu et al., 1994). Others, however, were not able to confirm this post- 
translational modification (Godet et al., 1992; Raamsman et al., 2000). 

In coronavirus-infected cells the E protein has been observed by 
immunofluorescence studies to occur at intracellular membranes as 
well as at the cell surface (Godet et al., 1992; Smith et al., 1990; Tung 
et al., 1992; Yu et al., 1994). When expressed exogenously from cDNA, 
the E protein was detected only in intracellular organelles, although at 
different locations. The MHV E protein localized to pre-Golgi mem- 
brane compartments, as was demonstrated by its colocalization with 
rab-1, a marker for the endoplasmic reticulum and the ERGIC, by 
electron microscopy (Raamsman et al., 2000). The IBV E protein, 
tagged at its amino terminus with an epitope, was also localized 
to pre-Golgi compartments (Lim and Liu, 2001). In another study, 
however, the IBV E protein was shown to accumulate in the Golgi 
apparatus, being distributed throughout the complex (Corse and 
Machamer, 2000, 2002). While the former study identified an ER- 
targeting signal in the extreme carboxy terminus of the E protein 
(Lim and Liu, 2001), the latter studies, using carboxy-terminal trunca- 
tions, mapped the Golgi-targeting information to a region between 
tail residues 13 and 63 (Corse and Machamer, 2002). In addition, 
these authors showed the IBV E cytoplasmic tail to be necessary and 
sufficient for Golgi targeting. 

The E protein was identified as a virion component relatively late, 
due to its low abundance and its small size (Godet et al., 1992; Liu 
and Inglis, 1991; Yu et al., 1994). It was estimated to occur in IBV, 
TGEV, and MHV virions at a rate of about 100, 20, and 10 molecules 
per particle, respectively (Godet et al., 1992; Liu and Inglis, 1991; 
Vennema et al., 1996). Because of its low abundance the E protein 
may not have a genuine structural function in the virion envelope. 
Rather, it may have a morphogenetic function by taking strategic 
positions within the M protein lattice to generate the required mem- 
brane curvature. Alternatively, it may serve to close the neck of the 
budding particle as it pinches off the membrane (Vennema et al., 
1996). Expression of the E protein alone induced the formation of 
characteristic membrane structures also observed in infected cells, 
which apparently consist of masses of tubular, smooth convoluted 
membranes (David-Ferreira and Manaker, 1965; Raamsman et al., 
2000). In addition, it resulted in the formation of vesicles con- 
taining the E protein, shown to be released from the cells (Corse and 
Machamer, 2000; Maeda et al., 1999). 

MHV infection induces caspase-dependent apoptosis in some, but 
not all, cells. By expressing the viral structural proteins separately 
in cells, the activity could be attributed to the E protein (An et al., 
1999). Apoptosis induction has not been reported for E proteins from 
other coronaviruses. 

Coronavirus E proteins share structural similarities with small 
hydrophobic membrane proteins found in other enveloped viruses. 
Examples are the Vpu protein of HIV-1, the 6K protein of alphaviruses, 
and the M2 protein of influenza virus. These proteins, also known 
as viroporins (Gonzalez and Carrasco, 2003), were demonstrated to 
modify membrane permeability and to help the efficient release of 
progeny virus. 


F. S Protein 


The S protein (previously known as E2) constitutes the spikes, 
the hallmark of coronaviruses under the electron microscope. It is 
the major determinant of host range, tissue tropism, pathogenesis, 
and virulence. It is a relatively large, 1160- to 1452-amino acid-long 
type I glycoprotein with a cleavable N-terminal signal sequence and 
a membrane-anchoring sequence followed by a short hydrophilic car- 
boxy-terminal tail of about 30 residues (Fig. 4). When comparing 
primary sequences, the S protein shows two faces: an amino-terminal 
half with hardly any sequence similarities and a carboxy-terminal 
half in which regions with significant conservation can be observed 
(de Groot et al., 1987a,b; for a review see Cavanagh, 1995), consistent 
with the distinctive functions of these domains (see later). 

The S protein is synthesized as a heavily glycosylated polypeptide as 
demonstrated by the susceptibility of the glycans to endoglycosidases 
and by the dramatic effect of the N-glycosylation inhibitor tunicamycin. 
The number of potential N-glycosylation sites ranges from 21 
(MHV) to 35 [feline infectious peritonitis virus (FIPV)]. The S protein 
has not been reported to contain O-linked sugars. Cotranslational 
N-glycosylation is an essential requirement for proper folding, oligo- 
merization, and transport of the S protein, as has also been shown 
for other (viral) glycoproteins (Doms et al., 1993). Growth of corona- 
viruses in the presence of tunicamycin resulted in the production of 
spikeless, noninfectious particles (Holmes et al., 1981; Mounir and 
Talbot, 1992; Rottier et al., 1981a; Stern and Sefton, 1982). These 
particles were devoid of S protein, which was found to aggregate in 
the endoplasmic reticulum when glycosylation was inhibited (Delmas 
and Laude, 1990). 
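
Potential N-glycosylation sites are conventionally counted as N-X-S/T sequons, where X is any residue except proline. A minimal sketch of such a count is given here; the function name and the toy sequence are hypothetical and are not taken from any coronavirus S protein, and the presence of a sequon does not imply that the site is actually glycosylated.

```python
import re

# Sequon N-X-S/T with X != P; a lookahead is used so that
# overlapping sequons (e.g., N-N-S-T) are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def potential_n_glyc_sites(seq):
    """Return 1-based positions of the asparagine in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Hypothetical toy sequence, for illustration only.
toy = "MNGTANPSQNNSTK"
print(potential_n_glyc_sites(toy))  # [2, 10, 11]; N-P-S at position 6 is excluded
```

Applied to a real ectodomain sequence, the length of the returned list would give the kind of site count quoted above.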

Folding of the S protein is a relatively slow process. Besides the 
addition of oligosaccharides it involves the formation and rearrange- 
ment of many intramolecular disulfide bonds. For the S protein of 
MHV-A59, the lumenal domain of which contains 42 cysteine residues, 
the major conformational events appear to take about 20 min during 
which the protein passes through a continuous spectrum of folding 
intermediates (Opstelten et al., 1993a). Folding of S is probably the 
rate-limiting step in the process of oligomerization. Sufficiently folded 
S protein monomers associate in the endoplasmic reticulum to form 
trimers (Delmas and Laude, 1990; Lin et al., 2004), with a half-time of 
approximately 1 h (Delmas and Laude, 1990; Vennema et al., 1990a,b). 
Trimerization is likely to be required for export out of the endoplasmic 
reticulum. In infected cells S protein trimers interact with M protein 
(Opstelten et al., 1995) and perhaps also with E protein, and migrate 
to the virus assembly site. A fraction of the S protein is transported to 
the plasma membrane where it can cause cell-cell fusion, a feature 
formally attributed to the S protein by its individual expression in cells 
(de Groot et al., 1989; Pfleiderer et al., 1990). Under such expression 
conditions the bulk of the S protein remains intracellular (Vennema 
et al., 1990a) in the endoplasmic reticulum (Opstelten et al., 1995). 
Retrieval signals have been identified in the cytoplasmic tail of the S 
proteins from coronavirus groups 1 and 3 as well as in the tail of the 
SARS-CoV S protein, but not in the group 2 MHV S protein (Lontok 
et al., 2004). 

During its transport to the cell surface, either alone or as part of 
virions, the S protein undergoes further modifications. The N-linked 
sugars are modified and become mature during passage through the 
Golgi complex. The MHV S protein was shown to become palmito- 
ylated, a modification that may already take place in the endoplasmic 
reticulum (van Berlo et al., 1987). As a late step the S protein can be 
cleaved. A basic amino acid sequence resembling the furin consensus 
sequence motif (RXR/KR) occurs approximately in the middle of the 
protein and was shown to be the target of a furin-like enzyme in the 
case of MHV-A59 (de Haan et al., 2004). Cleavage has been demon- 
strated for S proteins from coronavirus groups 2 and 3, but not for 
S proteins from group 1 viruses (Cavanagh, 1995) or from SARS-CoV 
(Bisht et al., 2004). The resulting amino-terminal S1 subunit and the 
membrane-anchored S2 subunit remain noncovalently linked. It 
has been suggested that the S1 subunit constitutes the globular 
head, whereas the S2 subunit forms the stalk-like region of the spike 
(Cavanagh, 1983b; de Groot et al., 1987a,b). 
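
As an illustration of how a basic motif of the form RXR/KR can be located in a primary sequence, the following minimal sketch scans a protein sequence for the pattern R-X-(R/K)-R. The function name and the toy sequence are hypothetical and do not represent an actual S protein; a match marks only a candidate site and says nothing about whether cleavage occurs.

```python
import re

# R, any residue, R or K, R: the RXR/KR pattern mentioned in the text.
# The lookahead allows overlapping candidate sites to be reported.
FURIN_LIKE = re.compile(r"(?=(R.[RK]R))")

def find_furin_like_sites(seq):
    """Return (1-based position, motif) for every R-X-(R/K)-R match."""
    return [(m.start() + 1, m.group(1)) for m in FURIN_LIKE.finditer(seq.upper())]

# Hypothetical toy sequence, for illustration only.
toy = "MSTARRAKRSVLGGRSRRAA"
print(find_furin_like_sites(toy))  # [(6, 'RAKR'), (15, 'RSRR')]
```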

The coronavirus S protein has two functions, which appear to be 
spatially separated. The S1 subunit (or the equivalent part in viruses 
with uncleaved S protein) is responsible for receptor binding, and the 
S2 subunit is responsible for membrane fusion. For several corona- 
viruses the receptor-binding site in S1 has been mapped. For MHV 
strain JHM (MHV-JHM), for instance, it was located in the domain 
composed of the amino-terminal 330 residues of the S molecule (Kubo 
et al., 1994), residues 62-65 and 214-216 being particularly important 
(Saeki et al., 1997; Suzuki and Taguchi, 1996). This amino-terminal 
domain also determined CEACAM1 receptor specificity of various 
MHV strains (Tsai et al., 2003). For TGEV (Godet et al., 1994), 
HCoV-229E (Bonavia et al., 2003; Breslin et al., 2003), and SARS- 
CoV (Babcock et al., 2004; Wong et al., 2004) the receptor-binding 
domains have also been mapped to the S1 subunit, although in differ- 
ent regions. In several cases neutralizing antibodies were demon- 
strated to bind the receptor-binding domains and to prevent the 
interaction with the receptor (Godet et al., 1994; Kubo et al., 1994; 
Sui et al., 2004). 

The interaction between the S protein and its receptor is the major 
determinant for virus entry and host range restriction. Nonpermissive 
cell lines can be rendered susceptible by making them express 
the receptor (see later references). Coronaviruses can also be retar- 
geted to specific cells by exchanging the ectodomain of the S protein for 
that of an appropriate other coronavirus, as was demonstrated for 
MHV (Kuo et al., 2000) and FIPV (Haijema et al., 2003). Receptors 
have so far been identified for the group 2 coronavirus MHV 
(CEACAM; Dveksler et al., 1991, 1993; Williams et al., 1991); the group 
1 coronaviruses TGEV and porcine respiratory coronavirus (PRCoV) 
(pAPN; Delmas et al., 1992, 1993), FIPV (fAPN; Tresnan et al., 1996), 
and HCoV-229E (hAPN; Yeager et al., 1992); and for SARS-CoV (ACE2; 
Li et al., 2003). The S proteins of the group 2 coronaviruses have been 
observed to exhibit hemagglutinating activities. Although for bovine 
coronavirus (BCoV), HCoV-OC43, and hemagglutinating encephalo- 
myelitis virus (HEV) 9-O-acetylated sialic acids were identified as a 
receptor determinant (Krempl et al., 1995; Kunkel and Herrler, 
1993; Schultze and Herrler, 1992; Schultze et al., 1991a; Vlasak 
et al., 1988b), specific receptors for these viruses have not been identi- 
fied. Also, the MHV S protein appears to bind sialic acid derivatives 
in addition to its specific receptor CEACAM, which may suggest 
that sialic acids function as an additional receptor determinant for 
MHV-like coronaviruses (Wurzer et al., 2002). 

The ectodomain of the S2 subunit, which is involved in the fusion 
process, contains two heptad repeat (HR) regions (de Groot et al., 
1987a,b), a sequence motif characteristic of coiled coils. Mutations 
in the first (i.e., membrane-distal) HR region of the MHV S protein 
resulted in fusion-negative phenotypes (Luo and Weiss, 1998) or in 
a low-pH dependence for fusion (Gallagher et al., 1991), whereas 
mutations in the second HR region caused defects in S protein oligo- 
merization and fusion ability (Luo et al., 1999). A fusion peptide has 
not yet been identified in any of the coronavirus spike proteins, but 
is predicted to be located at (Bosch et al., 2004b; Chambers et al., 1990) 
or within (Luo and Weiss, 1998) the amino terminus of the first HR 
region. Binding of the S1 subunit to the (soluble) receptor, or exposure 
to 37°C and an elevated pH, has been shown to trigger conformational 
changes that are supposed to facilitate virus entry by activation of 
the fusion function of the S2 subunit (Breslin et al., 2003; Gallagher, 
1997; Lewicki and Gallagher, 2002; Matsuyama and Taguchi, 2002; 
Miura et al., 2004; Sturman et al., 1990; Taguchi and Matsuyama, 
2002; Zelus et al., 2003). This conformational change is thought to lead 
to exposure of the fusion peptide and its interaction with the target 
membrane, further changes resulting in the formation of a heterotri- 
meric six-helix bundle, characteristic of class I viral fusion proteins, 
during the membrane fusion process. Indeed, peptides corresponding 
to the HR regions of MHV (Bosch et al., 2003; Xu et al., 2004) and 
SARS-CoV (Bosch et al., 2004b; Ingallinella et al., 2004; Liu et al., 
2004; Tripet et al., 2004; Zhu et al., 2004) were found to assemble into 
stable oligomeric complexes in an antiparallel manner, which in the 
natural situation would result in the close colocation of the fusion 
peptide and the transmembrane domain. These peptides were further 
shown to be inhibitors of viral entry (Bosch et al., 2003, 2004b; Liu 
et al., 2004; Yuan et al., 2004; Zhu et al., 2004). 

Besides the HR regions, other parts of the S protein are also likely to 
be important for the fusion process. All coronavirus S proteins contain 
a highly conserved region (de Groot et al., 1987a), rich in aromatic 
residues, downstream of the second HR region, part of which may form 
the start of the transmembrane domain. The function of this domain 
is unknown, but a similar region in the HIV-1 Env protein was demon- 
strated to be important for viral fusion and Env incorporation 
into virions (Salzwedel et al., 1999). Immediately downstream of the 
transmembrane domain all S proteins contain a cysteine-rich region 
(de Groot et al., 1987a). Using a mutational approach including dele- 
tions, insertions, and substitutions, both the transmembrane domain 
and the cysteine-rich region immediately downstream thereof, but not 
the carboxy-terminal part of the cytoplasmic tail, were shown to be 
important for MHV S protein-induced cell-cell fusion (Bos et al., 1995; 
Chang and Gombold, 2001; Chang et al., 2000) (B. J. Bosch, C. A. M. de 
Haan, and P. J. M. Rottier, unpublished results). 

The cleavage requirements of the S proteins for the biological 
activities of the coronavirus spike remain enigmatic. Whereas the S 
proteins of group 1 coronaviruses, such as FIPV (Vennema et al., 
1990a), are not cleaved, those of other coronaviruses, particularly of 
groups 2 and 3, are cleaved to variable extents, depending on the viral 
strain and the cell type in which the viruses are grown (Frana et al., 
1985; reviewed by Cavanagh, 1995). Cleavage of the S proteins is not 
required to expose the internal fusion peptide. Whereas cleavage of 
the MHV S protein generally correlates strongly with cell-cell fusion 
(Cavanagh, 1995), virus-cell fusion appeared not to be affected by 
preventing S protein cleavage, indicating that these fusion events 
have different requirements (de Haan et al., 2004). Similarly, whereas 
trypsin activation of SARS-CoV S protein was required for cell-cell 
fusion, it did not enhance the infectivity of cell-free pseudovirions 
(Simmons et al., 2004). For MHV-4, the spikes of which are able 
to initiate fusion without prior interaction with the primary MHV 
receptor (Gallagher et al., 1992), the stability of the S1-S2 heterodi- 
mers after S protein cleavage is low, allowing receptor-independent 
fusion. During cell culture adaptation, however, selected mutant 
viruses carried deletions in the S1 subunit, downstream of the recep- 
tor-binding domain, which resulted in stabilized S1-S2 heterodimers 
and receptor-dependent fusion activity (Krueger et al., 2001). 


G. HE Protein 


Virions of group 2 coronaviruses generally contain a fringe of shorter 
surface projections in addition to the characteristic spikes (Bridger 
et al., 1978; King et al., 1985; Sugiyama and Amano, 1981). These 
viruses express and incorporate into their particles an additional 
membrane protein, HE (for a review see Brian et al., 1995). Although 
all group 2 viruses contain an HE gene, the protein is not expressed by 
all MHV strains (Luytjes et al., 1988; Yokomori et al., 1991), indicating 
that HE is a nonessential protein also in these viruses. 

The HE gene encodes a type I membrane protein of 424-439 residues 
that contains a cleavable signal peptide at its amino terminus (Hogue 
et al., 1989; Kienzle et al., 1990) and a transmembrane domain close 
to its carboxy terminus, leaving a short cytoplasmic tail of about 10 
residues (Fig. 4). The ectodomain contains 8-10 putative N-linked 
glycosylation sites. The putative esterase active site (FGDS) is located 
near the (signal-cleaved) HE amino terminus. The coronavirus HE 
protein has 30% amino acid identity with the HE-1 subunit of the 
HE fusion protein of influenza C virus and the HE protein of torovirus 
(Cornelissen et al., 1997). It has been suggested that coronaviruses 
have captured their HE module from influenza C virus or a related 
virus (Luytjes et al., 1988). However, influenza C virus, toroviruses, 
and coronaviruses may well have acquired their HE sequences 
independently, not from each other but from yet another source 
(Cornelissen et al., 1997). The HE protein becomes cotranslationally 
N-glycosylated when expressed in cells, giving rise to a polypeptide of 
approximately 60-65 kDa that rapidly forms disulfide-linked dimers 
(Hogue et al., 1989; Kienzle et al., 1990; King et al., 1985; Parker et al., 
1989; Yokomori et al., 1989; Yoo et al., 1992). The HE dimers (or a 
higher order structure thereof) become incorporated into virions, while 
a proportion is transported to the cell surface (Kienzle et al., 1990; 
Pfleiderer et al., 1991). 

Little is known as yet about the function(s) of the coronavirus HE 
protein. The protein contains hemagglutinin and acetyl esterase 
activities (Brian et al., 1995). While the HE proteins of BCoV, 
HEV, and HCoV-OC43 hydrolyze the 9-O-acetyl group of sialic acid 
and therefore appear to function as receptor-destroying enzymes 
(Schultze et al., 1991b; Vlasak et al., 1988a), the HE proteins of MHV- 
like coronaviruses function as sialate-4-O-acetylesterases (Klausegger 
et al., 1999; Regl et al., 1999; Wurzer et al., 2002). Although inhibition 
of the esterase activity of BCoV resulted in a 100- to 400-fold reduction 
in viral infectivity (Vlasak et al., 1988a), it was shown both for BCoV 
and for an MHV strain expressing an HE gene that the S protein is 
required and sufficient for infection (Gagneten et al., 1995; Popova and 
Zhang, 2002). In view of these results it has been proposed that the HE 
protein might play a role at an even earlier step and may mediate viral 
adherence to the intestinal wall through the specific yet reversible 
binding to mucopolysaccharides. The process of binding to sialic acid 
receptors followed by cleavage and rebinding to intact receptors could 
theoretically result in virus motility and even allow migration through 
the mucus layer covering the epithelial target cells in the respiratory 
and enteric tracts (Cornelissen et al., 1997). 

Several studies have indicated the HE protein to play a role in 
pathogenicity. The HE protein of BCoV (Deregt and Babiuk, 1987), 
but not that of MHV, was able to induce neutralizing antibodies. 
However, passive immunization of mice with nonneutralizing, 
MHV HE-specific antibodies protected the animals against a lethal 
MHV infection (Yokomori et al., 1992). Furthermore, intracerebral 
expression of the HE protein in mice was found to affect the neuro- 
pathogenicity of MHV (Yokomori et al., 1995; Zhang et al., 1998). 
Strikingly, HE protein-defective MHV mutants were rapidly selected 
during viral infection in the mouse brain (Yokomori et al., 1993), which 
may suggest that the HE protein plays a more critical role during the 
infection of other tissues. 


III. Ultrastructural Observations of Coronavirus Morphogenesis 


A. Viral Budding 


Early electron microscopic studies demonstrated that coronavirus 
morphogenesis takes place at intracellular membranes and identified 
the cisternae of the endoplasmic reticulum as the site of budding of 
IBV and HCoV-229E (Becker et al., 1967; Chasey and Alexander, 1976; 
Hamre et al., 1967; Oshiro et al., 1971). Later studies revealed that 
early in infection particle formation occurs predominantly at smooth- 
walled, tubulovesicular membranes located intermediately between 
the rough endoplasmic reticulum and the Golgi complex (ERGIC). This 
so-called intermediate compartment was shown to be used as the early 
budding compartment by MHV, IBV, FIPV, TGEV, and SARS-CoV 
(Goldsmith et al., 2004; Klumperman et al., 1994; Tooze et al., 1984). 
At later times during infection the rough endoplasmic reticulum was 
seen to gradually become the major site of MHV budding in fibroblasts 
(Tooze et al., 1984). 

As already mentioned, ultrastructural studies localized the MHV 
and IBV M proteins in the budding compartment(s) but also in the 
Golgi complex, that is, beyond the site of budding (Klumperman et al., 
1994; Tooze et al., 1984). Apparently, accumulation of M protein alone 
is not sufficient to determine the site of budding; other viral and/or 
cellular factors are required as well. For MHV the E protein, which 
was found to localize to the intermediate compartment by immuno- 
electron microscopy (immuno-EM), was suggested to be such a candi- 
date (Raamsman et al., 2000), but more players are likely to be 
involved. Whether the helical nucleocapsids, visible as electron-dense 
cytoplasmic elements adjacent to budding profiles (David-Ferreira and 
Manaker, 1965; Dubois-Dalcq et al., 1982; Massalski et al., 1982; Risco 
et al., 1998; and references given previously), are a determining factor 
is unclear. In this respect, knowledge about the budding location of 
coronavirus-like particles (see later) might be informative. 


B. Postassembly Maturation of Virions 


Coronavirions are subject to an intracellular postbudding matura- 
tion process that occurs while they are on their way through the 
constitutive exocytic pathway by which they are exported out of the 
cell (Risco et al., 1998; Salanueva et al., 1999; Tooze et al., 1987). 
Indications of this had already been noticed in early morphological 
studies with HCoV-229E (Becker et al., 1967; Chasey and Alexander, 
1976; Hamre et al., 1967; Oshiro et al., 1971) and MHV (Holmes and 
Behnke, 1981; Holmes et al., 1981), but were described in somewhat 
more detail for MHV by Tooze and coworkers (1987). The pictures show 
“immature” virions in pre-Golgi compartments and Golgi cisternae 
that appear as spherical structures with the ribonucleoprotein core 
immediately below the viral envelope and with an “empty” center. By 
contrast, virions in the trans-Golgi network and beyond have the 
mature morphology showing a fairly uniform, high internal electron 
density. 

An extensive analysis of the structural maturation of coronavirions 
was reported for TGEV (Risco et al., 1998; Salanueva et al., 1999). 
Budding was shown to yield relatively large virions with an annular, 
electron-dense internal periphery and a clear central area. Smaller 
particles, with the characteristic morphology of extracellular virions, 
that is, having a compact, dense inner core with polygonal contours, 
were seen to accumulate in secretory vesicles in the periphery of the 
infected cell. Both types of particles appeared to coexist in the Golgi 
complex (Fig. 5). Obviously, the larger particles are the precursors 
of the smaller mature virions (Salanueva et al., 1999) and probably 
undergo their morphological maturation during their transport 
through the Golgi complex. The reorganization of the particle gives 
rise to the supposedly icosahedral core shell and is accompanied by a 
dramatic, approximately 50% reduction of the particle volume. 

Fig. 5. Structural maturation of coronavirus particles. Two types of virion-related 
particles were detected in TGEV-infected cells. Although large virions with an electron- 
dense internal periphery and a clear central area are abundant at perinuclear regions, 
smaller viral particles, with the characteristic morphology of extracellular virions, accu- 
mulate inside secretory vesicles that reach the plasma membrane. (A) Large virions 
(arrowheads) and small dense viral particles (arrows) coexist within the Golgi complex 
of infected cells (Risco et al., 1998). (B) For a direct comparison of size and morphology a 
small, dense particle and a large particle are shown (Salanueva et al., 1999). Pictures 
were kindly provided by C. Risco. 
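
To put the approximately 50% volume reduction in perspective, and assuming purely for illustration that both particle types are ideal spheres (an assumption not made in the source), the corresponding change in diameter follows from the cube-root relation between volume and diameter. The function name below is hypothetical.

```python
def diameter_ratio(volume_ratio):
    """Ratio of the diameters of two spheres whose volumes differ by volume_ratio."""
    return volume_ratio ** (1.0 / 3.0)

# Under the spherical assumption, halving the volume leaves
# roughly 79% of the original diameter.
print(round(diameter_ratio(0.5), 3))  # 0.794
```

In other words, under this simplification a halving of the volume would correspond to a diameter decrease of only about one-fifth.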

It is presently unknown what triggers the morphological reorgani- 
zation in the Golgi complex. Application of drugs affecting the state of 
the organelle did not give clues. As virions encounter an increasingly 
acidic pH on passage through the Golgi stack, studies addressing 
this parameter were done with lysosomotropic agents. Thus, chloro- 
quine and NH4Cl were applied to MHV-infected cells to elevate the pH 
at the trans side of the Golgi complex, but no effect on the maturation 
of MHV was observed (Tooze et al., 1987). Monensin, a drug that 
reversibly disorganizes the Golgi complex and blocks transport along 
the exocytic pathway, led to the accumulation of the large, annular 
TGEV virions; after reversal of the blockade, formation of the small, 
compact particles was restored (Salanueva et al., 1999). These 
observations confirm that the Golgi complex is necessary for TGEV 
structural transformation. Nocodazole treatment of cells causes a 
reversible fragmentation of the Golgi complex. Under these conditions 
TGEV virions were still able to undergo normal structural maturation. 
In contrast, still another Golgi-disrupting compound, brefeldin A, 
prevented their maturation (Risco et al., 1998). This compound leads 
to a redistribution of Golgi membranes to the ER, leaving no definable 
Golgi system. Interestingly, MHV particles accumulated in infected 
cells under these conditions appeared to be infectious when lib- 
erated by sonication (J. Meertens and P. J. M. Rottier, unpublished 
results), leaving us with an intriguing question about the function of 
the maturation process. 


IV. Molecular Interactions in Assembly of the Coronavirion 


A. Nucleocapsid Assembly 
1. Nucleocapsids in Infected Cells and Virions 


Helical nucleocapsids are assembled in the cytoplasm of coronavi- 
rus-infected cells. They have been recognized by their tubular appear- 
ance in electron microscopy studies with several viruses including 
IBV, HCoV-229E, TGEV, and MHV (Becker et al., 1967; Chasey and 
Alexander, 1976; David-Ferreira and Manaker, 1965; Dubois-Dalcq 
et al., 1982; Hamre et al., 1967; Massalski et al., 1982; Oshiro et al., 
1971; Risco et al., 1998). Large inclusions of nucleocapsids were seen 
to accumulate late in the infection of cells with HCoV (Caul and 
Egglestone, 1977) and MHV-JHM (Dubois-Dalcq et al., 1982). 

The structure of the nucleocapsid as it occurs in infected cells has not 
been studied in any detail. Ribonucleoprotein particles supposed to 
represent nucleocapsids have been isolated from MHV-infected cells 
and were shown to consist of genomic RNA and N protein (Perlman 
et al., 1986; Robb and Bond, 1979; Spaan et al., 1981). The particles 
sedimented as EDTA-resistant structures of 200-230S in sucrose gradients.
During the active phase of viral replication the majority (90%) 
of the intracellular genome-size RNA was found in these structures 
(Spaan et al., 1981). 

Ultrastructural studies of nucleocapsids derived from purified virion 
preparations have shown quite a variety of helical structures, depend- 
ing on the virus and the experimental conditions used. The overall 
feature, however, was that of a thread-like coil, sometimes appearing 
to be hollow, with a diameter varying between 9 and 16 nm and a 
length ranging from 0.32 µm up to 6 µm (Caul et al., 1979; Davies 
et al., 1981; Kennedy and Johnson-Lussenburg, 1975; Macnaughton 
and Davies, 1978). 

Biochemical analysis of nucleocapsids prepared by detergent disrup- 
tion of purified coronaviruses revealed the presence of genomic RNA 
and N protein. Interestingly, however, particles obtained by treatment 
of virions with Nonidet P-40 appeared to be spherical when viewed 
under the electron microscope and, in addition, to contain M protein, 
as was observed with TGEV (Garwes et al., 1976), HEV (Pocock and 
Garwes, 1977), and MHV-JHM (Wege et al., 1979). The presence of 
the M protein was found for MHV-A59 to depend on the preparation 
conditions. Whereas the protein was absent when the virions had been 
disrupted with Nonidet P-40 at 4°C, solubilization at 37°C resulted 
in copurification of M protein with the nucleocapsid (Sturman et al., 
1980). The higher temperature was found to cause a conformational 
change in the M protein, leading to its aggregation and association 
with the viral RNA in the nucleocapsid. Similarly, nucleocapsid struc- 
tures essentially lacking the M protein were also reported for IBV 
when virions were treated with detergent at low temperature (Davies 
et al., 1981). 

Although there is no direct evidence yet, it seems reasonable to 
assume that the helical nucleocapsids seen accumulating in the cytosol 
of infected cells constitute the reservoir that feeds into the viral bud- 
ding system. The location where these nucleocapsids are assembled 
has not been defined. Their production may take place either free in 
the cytoplasm, where the N protein is synthesized, or, alternatively, 
in association with the membrane-bound structures where genomic 
RNA is produced. The observed colocalization of N protein with the 
replication complexes (Bost et al., 2000, 2001; Denison et al., 1999; 
van der Meer et al., 1999) is consistent with the latter possibility. 
Coronavirus replication appears to occur on double-membrane vesicles 
(Gosert et al., 2002), which utilize components of the cellular autoph- 
agy pathway (Prentice et al., 2004). Whereas early in infection the 
replication complexes were shown to be almost entirely discrete from 
sites of M protein accumulation, at later times of infection helicase and 
N proteins appeared to colocalize with the M protein (Bost et al., 2000, 
2001). It was proposed that the translocation of helicase-N protein 
complexes to sites of virus assembly may serve as a mechanism to 
deliver the newly synthesized RNA and nucleocapsids and to facilitate 
the retention of the M protein in the intermediate compartment. 


2. Packaging Signals 


Encapsidation of genomic RNA into a nucleocapsid is presumably 
initiated by an interaction of the N protein with a specific nucleotide 
sequence, the packaging signal, which is subsequently followed by 
the polymerization of N proteins around the RNA molecule in a non- 
sequence-specific manner. The selective incorporation of genomic 
RNA into virions would predict the packaging signal to be located in 
sequences unique to this RNA, that is, within the approximately 20-kb 
region comprising the 5’ UTR and open reading frame 1 (ORF1) with 
the exception of the leader sequence. Although the data obtained so far 
support this prediction, no consistent picture has emerged yet. 

The approach generally used to map the packaging signal involved 
the study of helper virus-assisted encapsidation of natural and artifi- 
cially obtained defective RNA genomes. Thus, a 650-nucleotide region 
located at the 3’ end of the pol1b gene was initially identified for MHV 
(van der Most et al., 1991), which was subsequently narrowed to an 
area of 190 nucleotides (Fosmire et al., 1992). Within this area a stable 
stem-loop of 69 nucleotides was predicted. Mutation studies revealed 
that the integrity of this secondary structure was important and that 
the sequence of the packaging signal could be trimmed further to a 
minimum stretch of 61 nucleotides (Fosmire et al., 1992). The signal 
appeared to be sufficient for RNA packaging as its inclusion allowed a 
synthetic subgenomic mRNA of MHV-A59 to be packaged specifically; 
the encapsidation efficiency of the mRNA was, however, significantly 
lower than that of the defective genomic RNA from which it was 
transcribed (Bos et al., 1997). Even a nonviral RNA was found to be 
packaged into MHV particles when provided with the packaging signal 
(Woo et al., 1997). Buoyant density analysis of the particles revealed 
that the RNA was not assembled separately but copackaged with 
helper virus RNA. 

Studies of the corresponding pol1b region of another group 2 corona- 
virus, BCoV, indicated that within this group the packaging signal 
is structurally and functionally conserved. A 69-nucleotide sequence 
with significant homology (74%) to that of MHV was identified within a 
cloned 291-nucleotide segment sharing 72% homology overall (Cologna 
and Hogue, 2000). When this segment was fused to a noncoronavirus 
reporter gene sequence, the resulting RNA appeared to be packaged 
not only by the homologous helper virus BCoV but also by MHV. 
Conversely, when the MHV packaging signal was fused to the reporter 
gene sequence, the RNA was found to be encapsidated also in the 
context of a BCoV-infected cell (Cologna and Hogue, 2000). 

Mapping studies of packaging signals in the genomes of group 1 and 
group 3 coronaviruses have yielded quite different results. For IBV, 
deletion mutagenesis of a defective RNA led to the conclusion that 
only the sequences in the 5’ UTR and/or a region of the 3’ UTR 
were specifically required for packaging, although parts of the pol1b 
sequence, but not any part in particular, also enhanced the efficiency 
(Dalton et al., 2001). Somewhat similar conclusions could be drawn 
from a study with TGEV (Izeta et al., 1999). By comparing packaging 
efficiencies of different defective genomes it was inferred that information
for packaging was present both at the genomic 5’ end (about 1.0 kb) 
and in parts of pol1b. A packaging signal was subsequently mapped to 
a fragment representing the 5’ terminal 649 nucleotides of the genome 
by inserting a series of overlapping segments covering a stretch of 
about 2300 nucleotides from the 5’ end of the genome into an mRNA 
reporter expression construct contained within a defective genome 
(Escors et al., 2003); only the 5’ terminal sequence conferred on the 
mRNA the ability to be packaged by helper virus. 

It is too early to conclude that the apparently contrasting results 
reflect true, fundamental differences in encapsidation strategies 
between the different (groups of) coronaviruses. Although the overview 
may suggest the existence of different cis-acting signals, the data still 
allow a scenario in which multiple domains in the genome are involved 
cooperatively, each one contributing differently to (the efficiency of) 
the encapsidation process. Such contributions would not necessarily 
concern N protein binding only; the exceptional complexity of the 
coronaviral genome might call for additional provisions, related per- 
haps to the structuring of the encapsidation complex. Several observa- 
tions indeed imply the involvement of multiple domains. The efficient 
rescue, for instance, of a BCoV defective genome (Drep) that com- 
pletely lacks the putative 69-nucleotide packaging signal entails 
the participation of additional sequence(s) (Chang and Brian, 1996; 
Cologna and Hogue, 2000). Another example is the strongly increased 
rescue of an otherwise poorly packaged defective TGEV genome (M22) 
due to the presence of about 4.1 kb of sequences derived from the pol1b 
gene (M62) (Izeta et al., 1999). 


3. N-RNA Interactions 


There is no direct evidence yet for the actual functioning of the 
presumed packaging signals in the initiation of nucleocapsid assembly. 
Binding of N protein to these signals, the first step in the process, has 
so far been addressed only for the 69-nucleotide sequence of MHV. 
Specific binding to RNA transcripts containing this sequence was 
indeed demonstrated biochemically with MHV N protein derived from 
infected cells, from virions, and from cells expressing the protein 
(Molenkamp and Spaan, 1997). The binding efficiency, however, 
appeared to be relatively weak as was shown by comparing N protein 
binding to different parts of a packageable defective genome (MIDI-C) 
(Cologna et al., 2000). The highest binding efficiency was observed 
with an RNA transcript representing about 1 kb from the 5’ end of 
pol1a. Remarkably, not even removal of the packaging signal from 
MIDI-C RNA affected binding to the N protein as measured in the 
filter-binding assay used. The observations indicate that the domain 
containing the 69-nucleotide sequence does not function as a packaging
signal in the conventional sense, adding further support to the 
notion that coronaviral nucleocapsid assembly is an intricate 
process. 

Apart from the studies mentioned, the occurrence of N-RNA inter- 
actions has been amply documented. This is not surprising as the 
N protein has been implicated in several other processes that involve 
interaction with RNA, such as replication, transcription, and transla- 
tion (Lai and Cavanagh, 1997; see also Section II.C). As the relevance 
of these interactions for viral assembly is generally unclear, a brief 
survey of the available information is included here. In addition, 
an overview of data on the mapping of RNA interactions on the N 
polypeptide is schematically presented in Fig. 6. 

A high-affinity interaction between the N protein and the 5’ leader 
was demonstrated for MHV-A59 (Stohlman et al., 1988). Using an 
RNA overlay protein blot assay and various in vitro RNA transcripts, 
the binding of N protein was localized to a stretch of nucleotides 
(nucleotides 56-65) at the 3’ end of the leader (Stohlman et al., 1988). 
The stretch included the pentanucleotide repeat UCUAA now known 
to be critical for transcription. Biochemical analyses of the interaction 
measured a dissociation constant (Kd) of 14 nM for bacterially 
expressed MHV N protein to the leader RNA (Nelson et al., 2000). 
Consistent with the presence of a leader at the 5’ end of all viral RNAs, 
an N protein-specific monoclonal antibody coimmunoprecipitated 
genomic RNA as well as the subgenomic RNAs from MHV-infected 
cells (Baric et al., 1988). Similar observations were made for BCoV 
(Cologna et al., 2000). Packaging of subgenomic RNAs has been 
reported for TGEV (Sethna et al., 1991), BCoV (Hofmann et al., 
1990), and IBV (Zhao et al., 1993) but not for MHV, suggesting that 
their incorporation is not mediated by N protein-leader interaction. 
Whereas for TGEV and IBV the relative packaging of subgenomic 
RNAs was found to be inefficient, genomic RNA appearing in virions 
at a more than 10-fold molar excess over any subgenomic RNA species, 
the BCoV subgenomic N and M mRNAs appeared to be packaged as 
abundantly as the genome (Hofmann et al., 1990). A reevaluation for 
TGEV revealed that the detection of subgenomic RNAs in virions was 
related to the purity of virus preparations, indicating that mRNAs 
were not specifically encapsidated (Escors et al., 2003). 

Besides the leader, the N protein has been found to bind to other parts 
of the coronaviral genome. In addition to the binding site in the MHV 
pol1a gene mentioned previously, a high-efficiency binding site was identified
by the same authors in the 3’ half of the N gene of both MHV 
and BCoV (Cologna et al., 2000). The IBV N protein was shown to bind 
sequences in the 3’ terminal UTR of the genome (Zhou et al., 1996). 

Fig. 6. Structural organization of the coronavirus N protein. Common features and their distribution along the polypeptide 
chain are shown schematically. The hatched box indicates the most conserved part of the N protein, with a high proportion of 
aromatic residues. The N protein contains many basic residues throughout the polypeptide, but with particular clustering in two 
regions (+++). The upstream cluster contains a serine/arginine-rich region (S/R). The carboxy terminus, which contains a high 
proportion of acidic residues, is also indicated (---). The bars indicate parts of the N protein that have been implicated in N-N 
and N-RNA interaction. Furthermore, the location of the deletion in the MHV-A59 mutant Alb4 (Δ) is indicated as well as the 
domain where second-site mutations in revertant viruses of Alb4 are mapped. Finally, the parts of the N protein that could not be 
transferred from BCV into the MHV genome are marked. References are included in the figure. 

196 CORNELIS A. M. DE HAAN AND PETER J. M. ROTTIER 

Coronavirus N proteins do not contain sequence motifs typically 
found in other RNA-binding proteins. They appear to bind RNA both 
nonspecifically (Masters, 1992; Robbins et al., 1986) and in a sequence- 
specific way (Cologna et al., 2000; Nelson and Stohlman, 1993; Nelson 
et al., 2000; Stohlman et al., 1992). Non-sequence-specific RNA binding 
has been mapped to a large central domain of the MHV N molecule 
(Fig. 6) (Masters, 1992). Also, the leader-binding property was as- 
signed to this domain; this activity was initially mapped to the area 
containing the two highly basic regions (Nelson and Stohlman, 1993), 
but was later narrowed to a 55-residue segment containing the serine/ 
arginine-rich basic region (Nelson et al., 2000). Interestingly, this 
particular region could not be interchanged with its BCoV counterpart 
in a study on the functional equivalence of the N proteins from these 
related viruses (Peng et al., 1995b). Another domain in the MHV 
N protein implicated in viral RNA binding was mapped to an area that 
partly overlaps with the second basic region. The assignment was 
based on an analysis of second-site revertants of MHV mutant Alb4, 
the virions of which are extremely thermolabile because of a 29-residue 
deletion located between the central and carboxy-terminal domain of 
the N protein (Koetzner et al., 1992). The reverting mutations corre- 
lated with restoration of the disturbed RNA-binding capacity of the 
MHV N protein and were found clustered close to the basic region some 
80 residues on the amino side of the deletion (Fig. 6) (Peng et al., 
1995a). Although all these studies consistently attribute a major role 
in RNA binding to the central portion of the coronaviral N protein, the 
interaction of the IBV N protein with the 3’ UTR of IBV RNA mentio- 
ned previously was mapped to the amino- and carboxy-terminal do- 
mains of the molecule (Zhou and Collisson, 2000). 3’ UTR RNA-binding 
activity was also assigned to the amino-terminal domain of the 
SARS-CoV N protein on the basis of studies using nuclear magnetic 
resonance spectroscopy (Huang et al., 2004). 


4. N-N Interactions 


It is obvious that the wrapping of the 30-kb coronaviral genome 
into the compact helical nucleocapsid is largely driven by N protein 
interactions. As there are no indications for packaging of the RNA into 
a preformed capsid, these interactions can be described by the following
model. Packaging is initiated by binding of the N protein, either 
as a monomer or in a multimeric form, to the RNA. By analogy to 
other RNA viruses, this sequence-specific interaction may induce a 
conformational change in the N protein, thereby creating a nucleation 
site for the cooperative stacking of N protein units along the entire 
length of the RNA, now in a non-sequence-specific way. These N units 
can again be monomeric or consist of defined multimers. Finally, 
helix formation is driven by interactions between N molecules that are 
separated along the ribonucleoprotein chain but become adjacent 
in neighboring helices. This model predicts multiple nonequivalent 
interactions between N molecules. 

N-N interactions have been experimentally demonstrated for 
MHV, BCoV, and HCoV-OC43. High molecular weight species of the 
N protein, possibly trimers, were detected by sodium dodecyl sulfate-polyacrylamide
gel electrophoresis (SDS-PAGE) of virion preparations 
under nonreducing (but not under reducing) conditions, which is 
indicative of intermolecular disulfide bonds between the N subunits 
(Hogue et al., 1984; Narayanan et al., 2003b; Robbins et al., 1986). 
These complexes are likely to be additionally stabilized by noncovalent 
interactions as coronavirus N protein cysteines are not well conserved, 
the SARS-CoV N protein even lacking any cysteine residues. Both 
monomeric and oligomeric N species were able to bind RNA (Robbins 
et al., 1986). Multimeric forms of the N protein were also found in 
association with intracellular genomic RNA in MHV-infected cells as 
shown after the selective isolation of this ribonucleoprotein through 
coimmunoprecipitation with the M protein (Narayanan et al., 2003b). 
High molecular weight forms of the N protein corresponding to dimers 
and trimers were also demonstrated in vitro after ultraviolet (UV) 
cross-linking of BCoV N protein to RNAs (Cologna et al., 2000). 

Few studies have addressed the identification of the N-N interaction 
domains. The results so far are inconsistent (Fig. 6), but this might 
equally well reflect the predicted occurrence of nonequivalent interactions. 
Interaction sites were mapped to the amino-terminal part of the 
MHV N protein (Wang and Zhang, 1999). Using an in vitro binding 
assay in which the full-length N protein was incubated with bacterially 
expressed fusion proteins containing different segments of the N 
protein, interaction was observed with a polypeptide derived from 
the amino-terminal one-third (residues 1-162) of the protein and with 
a polypeptide representing the central part (residues 163-292). The 
latter domain contains the serine/arginine-rich region implicated in 
the binding to the leader/TRS-specific sequences (Nelson et al., 
2000). This domain could not be replaced by the corresponding domain 
from BCoV without loss of viral viability, from which the authors 
indeed inferred an involvement in protein-protein interactions (Peng 
et al., 1995b). The same domain was also shown to be essential for the 
homotypic association of the SARS-CoV N protein (He et al., 2004). 
Using a mammalian two-hybrid approach, the N-N interaction 
appeared to be abolished completely when the serine/arginine-rich 
region had been deleted. However, in another study using the yeast 
two-hybrid system this interaction was not confirmed. A polypeptide 
consisting of the amino-terminal two-thirds of the SARS-CoV N 
protein, which contains the serine/arginine-rich region, exhibited no 
association with the full-length protein (Surjit et al., 2004). Rather, 
self-association was attributed to the carboxy-terminal 209 residues of 
the molecule, which lack the motif. 


B. Envelope Assembly 
1. Formation of Virus-Like Particles: M-E Interactions 


Unlike most other enveloped viruses, coronaviruses have the 
remarkable feature of being able to independently assemble their 
envelope. Indications for this were already noticed in early electron 
microscopy studies of viral preparations and infected cells showing the 
occurrence of apparently “empty” particles (Afzelius, 1994; Chasey and 
Alexander, 1976; Macnaughton and Davies, 1980). Incomplete virions 
with the typical coronavirus morphology but lacking the N protein and 
the genome could indeed be separated from normal IBV particles by 
their lower density in sucrose gradients (Macnaughton and Davies, 
1980). The definition of virus-like particles (VLPs) and the require- 
ments for their formation were established by the coexpression of the 
coronaviral structural proteins in mammalian cells (Vennema et al., 
1996). Membranous particles were assembled when the MHV envelope 
proteins M, E, and S were coexpressed, without the need for an N 
protein or genomic RNA. The particles were released from the cells 
and, when examined under the electron microscope, appeared to be 
morphologically indistinguishable from authentic virions, that is, they 
had the characteristic shape and dimensions of normal virions. Also, 
their membrane protein composition was similar to that of MHV, with 
a high abundance of M protein and only trace amounts of E protein. 
Quite surprisingly, only the M and E proteins were required for parti- 
cle assembly. Both S and N proteins were dispensable for particle 
formation but, whereas the S protein became incorporated when pres- 
ent, this was not the case for the N protein (Vennema et al., 1996), 
except in combination with (defective) genomic RNA, in which case a 
nucleocapsid was coassembled (Bos et al., 1996; Kim et al., 1997). 
Individual expression of the M or the E protein in cells did not give 
rise to formation of VLPs although E protein synthesis by itself led to 
the secretion of E-containing vesicles (Corse and Machamer, 2000; 
Maeda et al., 1999). The nature of these particles has not been characterized
in much detail; the vesicles induced by MHV E sedimented 
slightly slower than virions (Maeda et al., 1999) whereas the particles 
obtained with IBV E had about the same density as virions (Corse and 
Machamer, 2000). 

In addition to MHV and IBV, VLPs have so far been described for 
BCoV (Baudoux et al., 1998b), FIPV (Godeke et al., 2000), and TGEV 
(Baudoux et al., 1998b). The observations demonstrate the unique 
budding mechanism of coronaviruses, which is dependent solely on 
the envelope proteins M and E but independent of a nucleocapsid. 
Somewhat similar observations have been described for the flavivirus 
tick-borne encephalitis virus and for hepatitis B virus, which also 
produce proteolipid particles on expression of their envelope proteins 
prM and E (Allison et al., 1995; Mason et al., 1991) and S (Patzer 
et al., 1986; Simon et al., 1988), respectively, but these particles are 
much smaller than the corresponding virions. In contrast, particles 
with the typical, large size of coronaviruses are acquired by the con- 
certed action of just the proteins M and E (Baudoux et al., 1998b; 
Vennema et al., 1996). Budding of enveloped viruses generally requires 
a nucleocapsid (for a review see Garoff et al., 1998). For retroviruses 
the Gag protein, the precursor to the nucleocapsid, is all that is needed 
to obtain particles resembling immature virions; the Env protein is 
dispensable. Budding of alphaviruses, on the other hand, requires both 
the envelope proteins and the nucleocapsid. Interestingly, the same 
appears to hold true for arteriviruses (R. Wieringa, A. A. F. de Vries, 
and P. J. M. Rottier, unpublished observations), which are closely 
related to the coronaviruses, share with them a triple-spanning enve- 
lope protein, and bud into early membranes of the secretory pathway, 
like coronaviruses but unlike alphaviruses. 

It is unknown how the coronavirus M and E proteins cooperate in 
budding. As the extensive electron microscopy work with M proteins 
from various coronaviruses gave no indications that this protein causes 
membrane bulging by itself, it is believed that the function of the E 
protein in coronavirus budding is in the induction of curvature in the M 
protein lattice (see later) and the subsequent budding of the membrane 
(Vennema et al., 1996). Given its low abundance in the virion, the E protein 
does not seem to serve a genuine structural function in the sense of occupying 
frequent, regular positions in the M protein framework. Consistent 
with an important role of the E protein in particle morphogenesis, 
mutations in its hydrophilic carboxy-terminal part, introduced by 
targeted recombination into the MHV genome, yielded thermolabile 
viruses one of which showed aberrant virion morphology with pinched 
and elongated shapes when viewed in the electron microscope (Fischer 
et al., 1998). Revertant analyses revealed that a single second-site 
amino acid change within the E protein was able to reverse the pheno- 
typic effect of the original mutations, providing support for possible 
interactions between E protein monomers during budding (Fischer 
et al., 1998). Unexpectedly, complete deletion of the E gene from 
the coronaviral genome does not abolish virion formation, demonstrat- 
ing that the protein is not essential for budding. Whereas this deletion 
dramatically (at least 1000-fold) reduced the release of infectivity 
from infected cells in the case of MHV-A59 (Kuo and Masters, 2003), 
knockout of the TGEV E gene resulted in a lethal phenotype (Curtis 
et al., 2002; Ortego et al., 2002). Remarkably, however, in the latter 
case virions still assembled but these appeared to be unable to leave 
the cells (J. Ortego and L. Enjuanes, personal communication). 

The VLP system offers a convenient assay to study many aspects of 
coronavirus envelope assembly. It was thus used to analyze the prima- 
ry structure requirements of the M and E proteins for particle forma- 
tion. For the M protein such studies demonstrated each of its different 
domains to be important. In general, mutations (deletions, insertions, 
and point mutations) in the lumenal domain, the transmembrane 
domains, the amphiphilic domain, or the carboxy-terminal domain of 
the MHV M protein strongly affected its ability to form VLPs (de Haan 
et al., 1998a). The assembly process was particularly sensitive to 
changes in the carboxy terminus of the protein. Truncation by only 
one residue reduced the efficiency severely whereas removal of two 
residues fully abolished particle formation. These effects appeared to 
be less severe in the context of a normal coronaviral infection, probably 
because additional interactions can compensate. The single-residue 
deletion, when introduced into the MHV genome, was without mea- 
surable phenotype and also a mutant virus with a truncation of two 
residues could be obtained, although with difficulty, as it was severely 
affected in its growth (de Haan et al., 1998a; Kuo and Masters, 2002). 
The importance of the M protein cytoplasmic and transmembrane 
domains was confirmed by VLP studies in the IBV system; mutant 
proteins lacking portions of either of these domains were unable to 
support particle assembly (Corse and Machamer, 2003). 

Studies of the primary structure requirements of the E protein for 
VLP formation revealed that the sequence of its hydrophobic domain 
was not critical. The assembly capacity of the protein was maintained 
when its transmembrane region was partly or completely replaced by 
the corresponding domain of the vesicular stomatitis virus (VSV) G 
protein (Corse and Machamer, 2003). However, when its small amino- 
terminal ectodomain was additionally replaced by the large VSV G 
protein counterpart, the chimeric protein became nonfunctional. Dele- 
tion of the transmembrane, but not the amino-terminal, domain 
rendered the E protein essentially assembly incompetent (Lim and 
Liu, 2001). Deletions in the cytoplasmic carboxy-terminal half of the 
E protein mapped the cysteine-rich region as the most important part 
for VLP assembly (Lim and Liu, 2001). 

Although the interaction between M and E proteins is amply demon- 
strated by their interdependence for VLP formation, direct evidence 
for their interaction has actually been provided only for the IBV 
proteins (Corse and Machamer, 2003; Lim and Liu, 2001). The two 
proteins could be cross-linked to each other in IBV-infected cells and in 
cells coexpressing the M and E genes (Corse and Machamer, 2003). It 
appeared that the cytoplasmic tails of both proteins were required, 
suggesting they are involved in the interaction. In another study 
M-E interaction was demonstrated by a coimmunoprecipitation assay. 
Also in this assay the cytoplasmic domain of the E protein, comprising 
the cysteine-rich region, was found to be important as its deletion 
affected M-E interaction to the greatest extent when compared with 
other deletion mutant E proteins (Lim and Liu, 2001). The results 
from both studies also showed that the ability of mutant E or M 
proteins to interact did not correlate with their assembly competence. 
Apparently, other requirements such as homotypic E or M interactions 
or interactions with host cell components must be met. 

The specificity of the interaction between the M and E proteins 
during particle assembly was further demonstrated by the poorly 
successful attempts to generate chimeric VLPs. No particles were 
observed when heterologous combinations of TGEV and BCoV M and 
E proteins were coexpressed (Baudoux et al., 1998b) and the same was 
true for heterologous combinations of FIPV and MHV M and E proteins 
(H. Vennema and P. J. M. Rottier, unpublished results). In both studies 
chimeric M and E proteins were also tested, demonstrating that, 
except in one case, exchanges between corresponding domains ren- 
dered the proteins assembly incompetent. Only when the TGEV M 
protein amino-terminal ectodomain was replaced with that of BCoV 
did the chimeric polypeptide support VLP formation in combination 
with the TGEV E protein; VLP formation was also supported, but to 
different extents, with TGEV/BCoV chimeric E proteins and, albeit poorly, 
with the BCoV E protein (Baudoux et al., 1998b). The reciprocal
M construct, the BCoV M protein carrying the TGEV ectodomain, 
was nonfunctional, even in combination with TGEV E protein. Consis- 
tently, replacement of the ectodomain with that of FIPV M also 
abolished the productive partnership of MHV M protein with MHV E 
(de Haan et al., 1999). 


2. M-M Interactions 


As the disproportionate amounts of M and E proteins in VLPs 
already imply, homotypic interactions between M molecules must con- 
stitute the energetic basis underlying the formation of the coronaviral 
envelope. In MHV-based VLPs generated by coexpression of M and E 
proteins, for instance, the sheer excess of M protein—the relative 
molar presence of E in the particles is less than 1%—is evidence for 
the strong interactive forces between M molecules. Hence, envelope 
assembly is thought to be driven primarily by laterally interacting 
M molecules that form a two-dimensional lattice in intracellular mem- 
branes (Opstelten et al., 1993b, 1995). Large multimeric complexes of 
M protein have indeed been demonstrated biochemically after individ- 
ual expression of the MHV protein in cells. When the association of the 
M molecules was maintained by the careful selection of cell lysis con- 
ditions, sucrose gradient analysis revealed the existence of large 
heterogeneous (up to about 40 molecules) complexes, which accumulated 
in the Golgi compartment (Locker et al., 1995). Somewhat smaller 
complexes were obtained when the cytoplasmic tail of the protein 
was removed; these complexes were no longer retained in the Golgi 
apparatus but transported to the cell surface. Apparently, the tail 
domain is not essential for the lateral interactions between M proteins, 
but it is critically required for budding (de Haan et al., 1998a, 2000). 
Similar higher order complexes of the M protein have also been demon- 
strated in MHV-infected cells as well as in MHV virions (Opstelten 
et al., 1993b, 1995). 

Further support for the existence of homotypic M protein interac- 
tions and additional insight into the domains involved in these inter- 
actions came from work with mutant M proteins that are unable to 
assemble into VLPs. In these studies MHV M proteins with deletions 
in either the transmembrane regions, the amphipathic domain, or 
the extreme carboxy terminus or with substitutions of the lumenal 
domain were tested for their ability to associate with other M proteins 
and to be rescued into VLPs formed by assembly-competent M 
proteins (de Haan et al., 1998a, 2000). It appeared that the mutant 
proteins maintained these biological activities despite the often severe 
alterations; actually, the only mutant protein that had lost these abilities 
was one in which all three transmembrane domains had been 
replaced by a heterologous transmembrane domain. It was concluded 
that M protein molecules interact with each other through multiple 
contact sites, particularly at the transmembrane level. It was further- 
more hypothesized that the full complement of interactions between 
the M molecules is required for efficient particle formation; possibly, all 
these interactions are required to provide the free energy to generate 
and stabilize the budding envelope. The failure of M protein mutants 
capable of associating with assembly-competent M protein to assemble 
into VLPs by themselves (de Haan et al., 1998a, 2000) indicates that 
additional interactions with viral (E) and/or host proteins are required. 
In this respect it is of note that the IFN-inducing capacity of the 
M protein, demonstrated for TGEV and BCoV, also requires the pres- 
ence of the E protein, which suggests that the induction of IFN is 
dependent on a specific, probably regularly organized structure of the 
M protein (Baudoux et al., 1998a,b). It is unclear how the presence of 
the E protein alters the M protein lattice to achieve this effect. 


3. M-S and M-HE Interactions 


Coronavirus envelope assembly is not dependent on the S protein or 
the HE protein. This is obvious from work with VLPs as well as with 
viruses showing that bona fide particles were produced when these 
proteins were either simply absent or unavailable for assembly. Avail- 
ability can be compromised under conditions in which proper folding of 
the proteins is affected. Inhibition of N-glycosylation by the drug 
tunicamycin, for instance, can lead to aggregation and retention of 
membrane proteins in the endoplasmic reticulum and has been shown 
to prevent the incorporation into virions of both the S protein (Holmes 
et al., 1981; Mounir and Talbot, 1992; Rottier et al., 1981a; Stern and 
Sefton, 1982) and the HE protein (Mounir and Talbot, 1992). The same 
effect has been observed with temperature-sensitive MHV mutants 
carrying defects in their S gene, which, when grown at the restrictive 
temperature, gave rise to spikeless particles (Luytjes et al., 1997; 
Ricard et al., 1995). 

Both S and HE proteins are assembled into the coronaviral envelope 
through interactions with the M protein. Such interactions have been 
demonstrated for MHV and BCoV M and S proteins and for BCoV M 
and HE proteins, in infected cells, in cells coexpressing the proteins, 
and in virions (Nguyen and Hogue, 1997; Opstelten et al., 1995). Com- 
plexes of the proteins were shown by coimmunoprecipitation and 
cosedimentation analyses as well as by immunofluorescence studies 
in which the intracellular transport of S and HE proteins to the plasma 
membrane was found to be inhibited by coexpressed M protein, the 
proteins being retained in the Golgi apparatus, the natural residence 
of the M protein. 

The kinetics with which the proteins engage in heteromeric complex 
formation appeared to be different for the different proteins. This 
effect is due to their different rates of folding and oligomerization. 
For the S protein these rates are low, involving the formation of 
multiple intramolecular disulfide bonds and the addition of numerous 
oligosaccharide side chains (Delmas and Laude, 1990; Opstelten et al., 
1993a; Vennema et al., 1990a,b). In contrast, folding of the MHV and 
BCoV M proteins is independent of disulfide bonds and glycosylation, 
as a result of which they are, for instance, swiftly transported out of 
the endoplasmic reticulum (Opstelten et al., 1993a). As a consequence, 
M molecules entered into M-S and M-HE complexes immediately after 
their synthesis, whereas newly synthesized S and HE molecules took 
15-30 min before they started to appear in these heterocomplexes 
(Nguyen and Hogue, 1997; Opstelten et al., 1995). The importance of 
folding as a major rate-limiting step was illustrated by the inability 
of the S protein to interact with M protein when its folding had been 
inhibited by in vivo reduction; only completely oxidized S molecules 
were association competent (Opstelten et al., 1993a, 1995). Whether 
the M protein interacts with S and HE proteins while they are still in 
their monomeric form or only after their oligomerization remains to be 
elucidated. It is, however, clear that the proteins engage in interaction 
with each other in early compartments, most likely the endoplasmic 
reticulum, as judged from the oligosaccharide maturation states of 
freshly formed protein complexes (de Haan et al., 1999; Nguyen and 
Hogue, 1997; Opstelten et al., 1995). Only dimers of HE were associated 
with HE-M-S complexes that were observed in BCoV-infected cells; 
because the appearance of HE in these complexes correlated with the 
kinetics of HE dimerization it was concluded that proper oligomeriza- 
tion is most likely a requirement for its association (Nguyen and Hogue, 
1997). Interestingly, such heterotrimeric complexes were not observed 
on coexpression of the three proteins in cells. Under these conditions 
only the heterodimeric M-S and M-HE associations were detected. 

The structural domains of M and S proteins that are involved in 
the formation and stabilization of their complex have been identified. 
Using the coimmunoprecipitation and colocalization assays referred to 
previously, the essential domains in the MHV M protein were mapped 
by a mutagenetic approach (de Haan et al., 1999). It appeared that 
M-S complex formation was sensitive to changes in all membrane-associated 
parts of the M molecule. Interactions between M and S 
proteins were found to occur at the level of the transmembrane 
domains and of the amphipathic domain, which is located on the 
cytoplasmic face of cellular membranes. In contrast, neither the lu- 
menally exposed amino terminus nor the hydrophilic cytoplasmic tail 
of the M protein was required; even the deletion of these parts—known 
to abrogate the ability of the protein to form VLPs—did not prevent 
association with the S protein. 

Chimeric S proteins were used to show that the large ectodomains of 
the spikes are not involved in interaction with M proteins. Such 
chimeric proteins were constructed from the MHV and FIPV S proteins 
and consisted of the ecto- or lumenal domain from the one and the 
transmembrane plus endodomain from the other. These proteins, 
which seemed biologically fit as they were still fusion active, were 
initially tested in coexpression studies with the M and E proteins from 
either virus for their ability to be incorporated into VLPs. They were 
found to assemble only into viral particles of the species from which 
their carboxy-terminal domain originated (Godeke et al., 2000). The 
chimeric S genes were subsequently incorporated into the proper coronavirus 
genomic background, creating the chimeric viruses fMHV and 
mFIPV, the spike ectodomains of which are from the feline and murine 
coronavirus, respectively; these studies provided the basis for the 
development of a novel targeted recombination system for reverse 
genetics of coronaviruses (Haijema et al., 2003; Kuo et al., 2000). 

Further fine mapping of the carboxy-terminal parts of the S protein 
involved in M-S protein interaction revealed the importance of the 
cytoplasmic tail. Again using coimmunoprecipitation and VLP incor- 
poration assays, it appeared that increasing truncations gradually 
abolished the association with the M protein (B. J. Bosch, C. A. M. 
de Haan, and P. J. M. Rottier, unpublished results). The significance of 
the tail domain was demonstrated most convincingly by showing the 
coimmunoprecipitation and VLP assembly of a chimeric VSV G protein 
the cytoplasmic tail of which had been replaced by that of MHV S. Tail 
truncations were tolerated in the context of the coronavirus; recombi- 
nant MHVs were generated that lacked 12 or 25 (but not 35) residues 
from the S protein carboxy terminus, but their growth was impaired 
by about 10- and 10*-fold, respectively. Also, tail extensions were 
tolerated, allowing the construction of a recombinant MHV with a 
spike protein extended at its carboxy terminus by the green fluorescent 
protein (GFP), yielding green fluorescent virions (Bosch et al., 2004a). 
The extension was, however, lost quite rapidly on serial passaging of 
the virus. 

Molecular details of the interaction of M and HE proteins and the 
requirements of HE for incorporation into viral particles have not been 
described. One study reported that HE protein mutants lacking part of 
their ectodomain were not assembled into particles (Liao et al., 1995). 
Most likely, however, this observation was due to folding or maturation 
defects of the mutant proteins. In another study the BCoV S and HE 
proteins were shown to be incorporated into MHV particles when 
coexpressed in MHV-infected cells. Apparently, homology between 
the proteins of these related group 2 coronaviruses is sufficiently high 
for heterologous M-S and M-HE interactions to occur (Popova and 
Zhang, 2002). 


C. Virion Assembly 


1. Envelope-Nucleocapsid Interactions 


Numerous electron microscopy studies have pictured the process of 
virion assembly in the coronavirus-infected cell. They show the close 
apposition of—presumably preassembled—tubular nucleocapsids to 
intracellular membranes, the appearance of membrane curvature 
at the contact sites, the “growth” of these buds into particle-sized 
vesicles, and the ultimate detachment of virions from the membranes 
by pinching off. 

It has become clear that the M protein is the central player, which, 
through its interactions with every known component of the virion, 
orchestrates the entire assembly process (see Fig. 7). In the process 
two levels of interaction can be distinguished. One is the level of the 
membrane where, as detailed in the previous section, the M protein 
interacts (1) with itself, to generate the basic molecular framework 
of the viral envelope, (2) with the E protein, to induce curving and 
budding of the M protein-modified membrane, and (3) with S and HE, 
to coassemble these spikes into the viral envelope. The other level at 
which the M protein operates involves the incorporation of the nucleo- 
capsid into the virion. Here, two types of interactions have been 
described: interactions of the M protein with the N protein and with 
the viral genome. 

An instrumental role of the M protein in drawing the nucleocapsid 
into the budding particle is indicated by their demonstrated interac- 
tion in studies with virion preparations. The M protein has been shown 
to remain associated with subviral particles obtained after treatment 
of virions with detergent that removes the spikes (Escors et al., 2001a,b; 
Garwes et al., 1976; Lancer and Howard, 1980; Wege et al., 1979). 

Fig. 7. The various domains of the MHV M protein and the processes for which they 
are important. The amphipathic domain of the M protein is represented by an oval. See 
text for references. 


The association was shown for MHV to be temperature dependent 
(Sturman et al., 1980). In the case of TGEV the association of M with 
the spherical structure, termed the core, was stabilized by basic pH and 
divalent cations but lost at high salt concentration, resulting in disrup- 
tion of the core structure and release of the helical nucleocapsid (Escors 
et al., 2001a,b; Risco et al., 1996). In an elegant study of the binding of in 
vitro-translated M polypeptides to purified nucleocapsids the ionic interaction 
was mapped to a 16-residue sequence in the hydrophilic carboxy-terminal 
tail domain (Escors et al., 2001b). Also in infected cells, 
interaction of the M protein with ribonucleoprotein structures, presum- 
ably nucleocapsids, has been demonstrated. Using M-specific antibo- 
dies, structures containing both N protein and genomic RNA were 
coimmunoprecipitated with M protein from MHV-infected cell lysates 
(Narayanan et al., 2000). Conversely, M protein was coprecipitated 
when an N-specific antibody was used, while in this case all viral 
mRNAs copurified because of their known leader-mediated affinity for 
the N protein. These interactions did not require an S or an E protein 
(Narayanan et al., 2000). 

Although interactions between the M and N proteins might intui- 
tively be expected to drive the process of attachment of the nucleocap- 
sid to the intracellular target membrane, direct experimental evidence 
for this interaction is strikingly lacking. Significantly, MHV M and N 
proteins coexpressed in cells were found not to interact (Narayanan 
et al., 2000) nor did purified TGEV N protein interact with in vitro- 
synthesized M protein (Escors et al., 2001b). Because the N protein 
occurs in coronavirus-infected cells in various configurations—as a 
free protein and in association with an array of partners including 
the viral genome—it is obvious that a selection mechanism must act 
to ensure that only nucleocapsids are assembled into particles. Consis- 
tently, coexpressed N protein is not incorporated into VLPs, but its 
inclusion depends on the presence of viral RNA (Bos et al., 1996; 
Vennema et al., 1996). Thus, unless the selection process is performed 
by a mechanism not involving the N protein, its association with 
genomic RNA to form a nucleocapsid seems required to generate the 
unique conformation that enables it to interact with the M protein. The 
only evidence to date for an interaction between M and N proteins is 
indirect and comes from genetic studies. Analysis of second-site rever- 
tants of a constructed MHV-A59 mutant virus lacking the two carboxy- 
terminal residues of its M protein revealed that the highly defective 
growth phenotype of this virus could be restored, among others, by 
mutations in the carboxy-terminal domain of the N protein (Kuo and 
Masters, 2002). In two independently obtained revertants the N pro- 
tein had lost 15 residues of this—among different strains of MHV 
highly conserved—domain because of a frameshifting 10-nucleotide 
deletion. The results argue strongly for a direct cooperation of the 
carboxy-terminal regions of the M and N proteins during virion forma- 
tion. Other indications supporting the occurrence of M-N interactions 
come from studies of complexes of M protein with ribonucleoprotein 
from MHV-infected cells and with TGEV cores (Escors et al., 2001b; 
Narayanan et al., 2000). When such complexes were treated with 
RNase the association of M and N proteins was not destroyed, suggest- 
ing a direct interaction. However, the presence of short RNAs inacces- 
sible to the RNase but sufficient to bridge the M-N interaction could 
not be excluded. 

The most unusual interaction that the coronaviral M protein seems 
to engage in involves genomic RNA. This interaction has so far been 
reported only for MHV, by Makino and co-workers. These workers had 
shown earlier that the 69-nucleotide packaging signal located in the 
pol1b gene could mediate the incorporation into virions of RNAs of 
even noncoronaviral origin (Woo et al., 1997). They subsequently 
showed that this incorporation is most likely effected by a direct and 
specific interaction of the signal with the M protein. When defective 
genomic RNAs or nonviral RNAs were introduced into helper MHV-infected 
cells, they could subsequently be isolated as ribonucleoproteins 
from lysates of the cells by immunoprecipitation with an 
M-specific antibody, but only if the RNAs contained the packaging 
signal (Narayanan and Makino, 2001). Coexpression experiments us- 
ing noncoronaviral vectors showed the interaction to be independent of 
the N protein. A reporter gene transcript generated in cells expressing 
the M protein could be coimmunoprecipitated with an anti-M monoclo- 
nal antibody provided that the RNA carried the packaging signal 
(Narayanan et al., 2003a). Moreover, when the E protein was addition- 
ally coexpressed, the signal-containing RNA—but not an identical 
RNA lacking this sequence—was found to be coincorporated into VLPs, 
irrespective of the presence of the N protein. Altogether these observa- 
tions reveal a hitherto unknown type of interaction between a viral 
envelope protein and genomic RNA. Although its significance remains 
to be further established the M-RNA interaction seems to provide 
additional selectivity to the assembly of the coronaviral nucleocapsid. 


2. Specificity and Flexibility 


Assembly of viruses is a process of generally high specificity. Direct- 
ed by specific targeting signals, the viral structural components colo- 
calize to distinct places in the cell where unique and complex 
molecular interactions control their assembly. These rules hold partic- 
ularly for naked viruses and many of the smaller enveloped viruses; 
there are, however, many examples where the process is considerably 
less selective and where “nonself” (host or viral) components are coas- 
sembled (see, e.g., Garoff et al., 1998). Interestingly, formation of the 
large, pleiomorphic coronaviruses appears to combine aspects of both 
great selectivity and extreme flexibility. 

With the M and E proteins as the fixed minimal requirement, co- 
ronaviral particles appear to tolerate the presence of all other viral 
components in practically every possible combination. A nucleocapsid 
is not required but, if available, it can take almost any length as 
defective (including chimeric) genomes of largely varying sizes have 
been accommodated. RNAs need not necessarily be packaged into a 
nucleocapsid; whether of viral or nonviral origin, if provided with the 
proper packaging signal they can be taken in even in the absence of an 
N protein. 

Also in the composition of their viral envelope these viruses are 
highly flexible. Spikes seem to be incorporated in variable numbers 
depending on availability. They tolerate severe manipulation, both of 
their ectodomain and of their endodomain. Thus, swapping of ectodomains 
between unrelated coronaviruses (i.e., from different groups) 
creates viable chimeric viruses whereas a foreign protein such as GFP 
appended to the S protein endodomain is accommodated in the particle, 
although reluctantly. Some coronaviruses have an HE protein but 
cle, although reluctantly. Some coronaviruses have an HE protein but 
continue growing well if for any reason the gene is not (properly) 
expressed as happens among different MHVs. Consistently, S and 
HE proteins are incorporated independent from each other. Direct 
interactions between S and HE were not observed when the proteins 
were coexpressed in cells (Nguyen and Hogue, 1997). This result is 
consistent with the idea that these proteins are separately drawn into 
the M protein lattice by their distinctive interactions with M mole- 
cules. It is also consistent with the concept that these proteins assume 
different positions within this lattice, a hypothesis based on the pre- 
sumed different geometric requirements for the incorporation of tri- 
meric and di- or tetrameric S and HE complexes, respectively. 

How, in the face of this enormous flexibility in accommodating all 
these various numbers and combinations of viral components, do co- 
ronaviruses manage to maintain specificity? Host proteins have not 
been noticed to occur in virions, although this may simply not have 
been looked at carefully enough. By probing the specificity, using viral 
and nonviral membrane proteins, it appeared that foreign proteins are 
effectively excluded from coronaviral particles (de Haan et al., 2000). 
However, some missorting was found to occur, consistent with earlier 
observations (Yoshikura and Taguchi, 1978). 

The picture of coronaviral envelope formation is one that is directed 
entirely by lateral interactions between the envelope proteins. In in- 
fected cells, membrane proteins—viral and cellular—are sampled for 
fit into the lattice formed by M molecules. The specificity of the molec- 
ular interactions acts as a quality control system to warrant the for- 
mation of the two-dimensional assemblies that contain the full 
complement of viral membrane proteins but from which cellular pro- 
teins are segregated. For each cellular protein the efficiency of this 
exclusion process is determined by its lack of interaction with the 
M protein, its lack of fit in the M protein framework, and its success 
in competing with the S and HE oligomers for the (geometrically 
different) vacancies within this framework. 


3. Localization of Budding 


The precise location of coronavirus budding and the factors that 
govern it have not been established. Although it is clear that particle 
formation occurs at membranes early in the secretory pathway, up to 
the cis-Golgi compartment, the precise site has not been identified for 
any coronavirus. Several considerations may explain this lack of 
knowledge. One is that these early compartments are themselves 
rather complex and highly dynamic and have hence been difficult to 
define structurally. Another is the possible alteration of the structural 
integrity of these compartments by infection; studies of these effects 
have not been described. A third complication may be that corona- 
viruses do not behave uniformly, different viruses possibly preferring 
different membranes for budding. In this respect it may be of note that 
differences have been observed, for instance, in the intrinsic localiza- 
tion of the M proteins from IBV and MHV; they appeared to accumu- 
late on the cis and trans side of the Golgi apparatus, respectively 
(Klumperman et al., 1994; Machamer et al., 1990). 

It has long been assumed that the M protein determines the site of 
coronavirus budding. When, however, this protein appeared to localize 
beyond this site, the idea became attractive that the association of the 
envelope proteins may create the novel targeting signals that direct 
these multimeric complexes to the budding site. In support of such a 
notion is the fact that the S and HE proteins, when coexpressed with 
the M protein, are retained in the Golgi apparatus rather than being 
transported to the plasma membrane. The critical question now is 
whether and how the E protein affects the localization of the M pro- 
tein. As the E protein by itself does not seem to localize to the virion 
budding site it will be of great interest to determine the membranes at 
which VLPs assemble. 

As the envelope proteins can direct particle formation by them- 
selves, it may seem that the nucleocapsid is not leading the assembly 
process. Still, besides giving rise to virions rather than VLPs, its 
involvement in assembly might have important consequences. First 
of all, nucleocapsids may enhance the efficiency of the budding process. 
The physical yields of VLPs obtained by coexpression of the envelope 
proteins in cells are generally poor. Although there may be many 
reasons for this, in infected cells the availability of nucleocapsids is 
likely to facilitate particle production. Empty particles, considered to 
be VLPs, have nevertheless been observed during natural infection 
(Afzelius, 1994; Macnaughton and Davies, 1980). Their formation 
might simply serve as a means to dispose of excess viral membrane 
proteins from infected cells if required. 

Another effect of the nucleocapsid could involve the localization of 
budding. It is conceivable that, unless a defined budding station is 
created by a specific interplay between viral and host proteins (for 
which no indications yet exist), preassembled nucleocapsids dock 
at those intracellular membrane sites where sufficiently sized patches 
of M protein-based envelope structure have accumulated. Early in 
infection such patches might start to form only after the envelope 
proteins have left the endoplasmic reticulum and become concentrated 
in intermediate membranes on their way to the Golgi complex. Later, 
when viral protein synthesis increases, this density might be reached 
earlier, perhaps explaining the observed late budding in the endoplas- 
mic reticulum (Goldsmith et al., 2004; Klumperman et al., 1994; Tooze 
et al., 1984). It will again be interesting to learn where VLPs indepen- 
dently bud and how this relates to the local density of the envelope 
proteins because this information will shed light on the role of the 
nucleocapsid in the localization of virion assembly. 


V. PERSPECTIVES 


The picture of coronavirus assembly that the available literature 
allows us to draw in this review is still a rough draft. We know 
the identity and some characteristics of the key elements of the pic- 
ture, we know the relative positions and orientations of most of them, 
but we are unable to fit them all into a sensible composition. 

Although this may seem like a discontented retrospective, it certainly 
is not. In the 25 years that the senior author has been in coronavirology 
research, enormous progress has been made in practically all its aspects 
including virion assembly. However, the rewarding act of compiling and 
ordering the available information and trying to abstract from it actual 
knowledge was at the same time a sharp and recurrent confrontation 
with the unknown. We want to conclude this work by summarizing 
what in our opinion will be the main issues for the near future. 

With the obstacle of reverse genetics technology solved, “structure” 
will be the dominating issue of the next decade. Biology has taught 
that molecular insight into processes will eventually depend critically 
on detailed structural information. For coronavirus assembly this 
means data on the individual structural components and, particularly, 
on the virion. With respect to the former, this will be most challenging 
for the membrane proteins, notably M and E. Virions, by their appar- 
ent elasticity, have eluded structural analysis. Here, despite the still 
limited resolution to be expected, cryoelectron microscopy should pro- 
vide the urgently required insight into the structural organization of 
the particle. 

Another issue will be the cell biology of assembly. This actually 
refers to a number of poignant problems at every stage of the process. 
Starting with nucleocapsid formation we must admit that we know 
practically nothing. By which interactions and where on the genome 
the packaging is initiated, how the wrapping of the RNA proceeds, how 
the condensation of the ribonucleoprotein structure takes place, and 
where in the cell these activities take place are all unresolved ques- 
tions. Although we seem to know more about the budding process, 
several fundamental issues are still unresolved. Obvious issues are 
the site of budding and the determinants of its location, and the 
inclusion of the nucleocapsid into the budding particle. An intriguing 
issue is the budding mechanism itself: how is membrane curvature 
generated and, particularly, how is the directionality determined? Coronaviruses, 
like other intracellularly budding viruses, direct their 
particles out of the cytoplasm into the organelles, that is, opposite to 
the natural direction of cellular vesicle budding. Once again, simply 
nothing is known about the governing principles. 

A third field of research that has yet to open is the contribution of 
host cellular factors to the assembly process. Work has so far been 
concentrating on the viral components and their interplay. Although 
there have been incidental indications, studies on the specific involve- 
ment of host proteins apparently had to await the development of 
appropriate technologies and these are now becoming available. 

Although the serious health threat caused by the 2002-2003 epidemic 
of SARS apparently has waned, the coronavirological community 
has welcomed the consequent increased interest in this family of 
viruses. The boost that the research in this field has since been experi- 
encing warrants an exciting future and accelerated progress with 
the elucidation of the fascinating process of coronavirus assembly. 